Deep in the Dark - Deep Learning-Based Malware Traffic Detection Without Expert Knowledge
Title | Deep in the Dark - Deep Learning-Based Malware Traffic Detection Without Expert Knowledge |
Publication Type | Conference Paper |
Year of Publication | 2019 |
Authors | Marín, Gonzalo, Casas, Pedro, Capdehourat, Germán |
Conference Name | 2019 IEEE Security and Privacy Workshops (SPW) |
Date Published | May |
ISBN Number | 978-1-7281-3508-3 |
Keywords | Analytical models, classification, Computer architecture, computer network security, Data models, Deep Learning, deep learning models, deep learning-based malware traffic detection, expert handcrafted features, expert systems, Expert Systems and Privacy, feature extraction, Human Behavior, human factors, invasive software, learning (artificial intelligence), machine learning-based systems, Malware, malware detection, malware network traffic, network security applications, network traffic, networking attacks, pattern classification, privacy, pubcrawl, Raw Measurements, raw-traffic feature representations, robust network security systems, Scalability, telecommunication traffic, Training |
Abstract | With the ever-growing occurrence of networking attacks, robust network security systems are essential to prevent and mitigate their harming effects. In recent years, machine learning-based systems have gained popularity for network security applications, usually considering the application of shallow models, where a set of expert handcrafted features are needed to pre-process the data before training. The main problem with this approach is that handcrafted features can fail to perform well given different kinds of scenarios and problems. Deep Learning models can solve this kind of issue using their ability to learn feature representations from input raw or basic, non-processed data. In this paper we explore the power of deep learning models on the specific problem of detection and classification of malware network traffic, using different representations for the input data. As a major advantage as compared to the state of the art, we consider raw measurements coming directly from the stream of monitored bytes as the input to the proposed models, and evaluate different raw-traffic feature representations, including packet and flow-level ones. Our results suggest that deep learning models can better capture the underlying statistics of malicious traffic as compared to classical, shallow-like models, even while operating in the dark, i.e., without any sort of expert handcrafted inputs. |
URL | https://ieeexplore.ieee.org/document/8844609 |
DOI | 10.1109/SPW.2019.00019 |
Citation Key | marin_deep_2019 |