Title | A Novel Machine Learning Based Malware Detection and Classification Framework |
Publication Type | Conference Paper |
Year of Publication | 2019 |
Authors | Sethi, Kamalakanta; Kumar, Rahul; Sethi, Lingaraj; Bera, Padmalochan; Patra, Prashanta Kumar |
Conference Name | 2019 International Conference on Cyber Security and Protection of Digital Services (Cyber Security) |
Keywords | accurate malware detection, analysis report, classification accuracy, classification framework, complex malware types, computer systems, cuckoo sandbox, dynamic analysis, feature extraction, feature selection, feature selection algorithms, fine-grained classification, high detection, Human Behavior, invasive software, learning (artificial intelligence), machine learning, machine learning algorithms, machine learning models, Malware, malware analysis, malware analysis framework, malware classification, malware detection, malware files, malware samples, Metrics, minimum computation cost, pattern classification, Predictive Metrics, privacy, pubcrawl, Resiliency, selection module, signature-based malware detection techniques, static and dynamic analysis, system activities, Testing, time progresses, Training, Virtual machining |
Abstract | As time progresses, new and complex malware types are being generated, posing a serious threat to computer systems. Because of this drastic increase in the number of malware samples, signature-based malware detection techniques cannot provide accurate results. Several studies have demonstrated the effectiveness of machine learning for the detection and classification of malware files. Further, the accuracy of these machine learning models can be improved by using feature selection algorithms to select the most essential features and reduce the size of the dataset, which leads to less computation. In this paper, we develop a machine learning based malware analysis framework for efficient and accurate malware detection and classification. We use Cuckoo sandbox for dynamic analysis; it executes malware in an isolated environment and generates an analysis report based on the system activities observed during execution. Further, we propose a feature extraction and selection module which extracts features from the report and selects the most important features to ensure high accuracy at minimum computation cost. Then, we employ different machine learning algorithms for accurate detection and fine-grained classification. Experimental results show that the framework achieves high detection and classification accuracy in comparison to state-of-the-art approaches. |
DOI | 10.1109/CyberSecPODS.2019.8885196 |
Citation Key | sethi_novel_2019 |
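
The extraction and selection steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say which features are extracted or which selection algorithm is used. The sketch assumes the sandbox report is a JSON-like dictionary with API calls under `behavior` → `processes` → `calls` → `api`, uses API call counts as features, and ranks them with a simple information-gain score as a stand-in. All function names (`extract_api_counts`, `select_top_features`, `to_vector`, `make_report`) and the toy reports are hypothetical.

```python
import math
from collections import Counter


def extract_api_counts(report):
    """Count API calls across all processes in a report dict.

    Assumes the layout behavior -> processes -> calls -> api.
    """
    counts = Counter()
    for process in report.get("behavior", {}).get("processes", []):
        for call in process.get("calls", []):
            api = call.get("api")
            if api:
                counts[api] += 1
    return counts


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(samples, labels, feature):
    """Entropy reduction from splitting samples on presence of a feature."""
    present = [y for x, y in zip(samples, labels) if x.get(feature, 0) > 0]
    absent = [y for x, y in zip(samples, labels) if x.get(feature, 0) == 0]
    total = len(labels)
    conditional = (len(present) / total) * entropy(present) \
        + (len(absent) / total) * entropy(absent)
    return entropy(labels) - conditional


def select_top_features(samples, labels, k):
    """Return the k features with the highest information gain."""
    features = sorted({f for s in samples for f in s})
    ranked = sorted(features,
                    key=lambda f: information_gain(samples, labels, f),
                    reverse=True)
    return ranked[:k]


def to_vector(sample, selected):
    """Project one sample onto the selected features as a count vector."""
    return [sample.get(f, 0) for f in selected]


def make_report(apis):
    """Build a toy report (hypothetical data, for illustration only)."""
    return {"behavior": {"processes": [{"calls": [{"api": a} for a in apis]}]}}


reports = [
    make_report(["NtCreateFile", "WriteProcessMemory", "CreateRemoteThread"]),
    make_report(["NtCreateFile", "WriteProcessMemory"]),
    make_report(["NtCreateFile", "NtReadFile"]),
    make_report(["NtCreateFile", "NtReadFile", "NtClose"]),
]
labels = ["malware", "malware", "benign", "benign"]

samples = [extract_api_counts(r) for r in reports]
selected = select_top_features(samples, labels, k=2)
vectors = [to_vector(s, selected) for s in samples]
```

In this toy data `NtCreateFile` appears in every report, so it carries no information about the class and is dropped, while the two APIs that separate the classes are kept. The reduced `vectors` could then be passed to any classifier; that step is omitted here because the abstract does not name the algorithms used.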