Identifying Encrypted Malware Traffic with Contextual Flow Data

Submitted by grigby1 on Tue, 05/30/2017 - 1:11pm

Title	Identifying Encrypted Malware Traffic with Contextual Flow Data
Publication Type	Conference Paper
Year of Publication	2016
Authors	Anderson, Blake; McGrew, David
Conference Name	Proceedings of the 2016 ACM Workshop on Artificial Intelligence and Security
Publisher	ACM
Conference Location	New York, NY, USA
ISBN Number	978-1-4503-4573-6
Keywords	artificial intelligence security, composability, Computational Intelligence, Encryption, machine learning, Malware, malware classification, Metrics, network monitoring, pubcrawl, Resiliency, Transport Layer Security, windows operating systems security
Abstract	Identifying threats contained within encrypted network traffic poses a unique set of challenges. It is important to monitor this traffic for threats and malware, but do so in a way that maintains the integrity of the encryption. Because pattern matching cannot operate on encrypted data, previous approaches have leveraged observable metadata gathered from the flow, e.g., the flow's packet lengths and inter-arrival times. In this work, we extend the current state-of-the-art by considering a data omnia approach. To this end, we develop supervised machine learning models that take advantage of a unique and diverse set of network flow data features. These data features include TLS handshake metadata, DNS contextual flows linked to the encrypted flow, and the HTTP headers of HTTP contextual flows from the same source IP address within a 5 minute window. We begin by exhibiting the differences between malicious and benign traffic's use of TLS, DNS, and HTTP on millions of unique flows. This study is used to design the feature sets that have the most discriminatory power. We then show that incorporating this contextual information into a supervised learning system significantly increases performance at a 0.00% false discovery rate for the problem of classifying encrypted, malicious flows. We further validate our false positive rate on an independent, real-world dataset.
URL	http://doi.acm.org/10.1145/2996758.2996768
DOI	10.1145/2996758.2996768
Citation Key	anderson_identifying_2016
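
The abstract describes gathering the HTTP contextual flows that share a source IP address with an encrypted flow and fall within a 5 minute window. The following is a minimal sketch of that selection step only, not the authors' implementation. The `Flow` record, its field names, and the `http_context` function are hypothetical, and the sketch assumes the window is measured symmetrically around the TLS flow's start time, which the abstract does not specify.

```python
from dataclasses import dataclass
from typing import List

# The 5 minute window mentioned in the abstract, in seconds.
WINDOW_SECONDS = 5 * 60


@dataclass
class Flow:
    """Hypothetical flow record; field names are assumptions for illustration."""
    src_ip: str
    start_time: float  # seconds since the epoch
    protocol: str      # e.g. "tls", "http", "dns"


def http_context(tls_flow: Flow, candidates: List[Flow],
                 window: float = WINDOW_SECONDS) -> List[Flow]:
    """Return HTTP flows from the same source IP that start within
    `window` seconds of the TLS flow (assumed symmetric window)."""
    return [
        f for f in candidates
        if f.protocol == "http"
        and f.src_ip == tls_flow.src_ip
        and abs(f.start_time - tls_flow.start_time) <= window
    ]


if __name__ == "__main__":
    tls = Flow("10.0.0.5", 1000.0, "tls")
    others = [
        Flow("10.0.0.5", 1100.0, "http"),  # same host, inside the window
        Flow("10.0.0.5", 2000.0, "http"),  # same host, outside the window
        Flow("10.0.0.9", 1000.0, "http"),  # different host
    ]
    for flow in http_context(tls, others):
        print(flow)
```

In a full pipeline along the lines the abstract outlines, the headers of the selected HTTP flows would be combined with TLS handshake metadata and linked DNS flows to form the feature vector for a supervised classifier. That step is omitted here because the abstract gives no detail on it.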
