Classification of Malware programs using autoencoders based deep learning architecture and its application to the microsoft malware Classification challenge (BIG 2015) dataset

Title: Classification of Malware programs using autoencoders based deep learning architecture and its application to the microsoft malware Classification challenge (BIG 2015) dataset
Publication Type: Conference Paper
Year of Publication: 2017
Authors: Kebede, T. M., Djaneye-Boundjou, O., Narayanan, B. N., Ralescu, A., Kapp, D.
Conference Name: 2017 IEEE National Aerospace and Electronics Conference (NAECON)
Keywords: Acoustics, autoencoder, classification process, classifier, cybersecurity, Deep Learning, deep learning architecture, Devices, Global Positioning System, gray scale images, Human Behavior, invasive software, Latches, learning (artificial intelligence), malicious programs, malware classification, malware programs, Metrics, microcontrollers, Microsoft Malware Classification Challenge BIG2015, microsoft malware Classification challenge dataset, Neural Network, pattern classification, Pattern recognition, pattern recognition algorithms, principle component analysis, privacy, pubcrawl, resilience, Resiliency, security, Sensors, superior architecture, Time factors
Abstract

Distinguishing and classifying different types of malware is important for better understanding how they can infect computers and devices, the threat level they pose, and how to protect against them. In this paper, a system for classifying malware programs is presented. The paper describes the architecture of the system and assesses its performance on a publicly available database (provided by Microsoft for the Microsoft Malware Classification Challenge BIG2015) to serve as a benchmark for future research efforts. First, the malicious programs are preprocessed such that they are visualized as gray scale images. We then make use of an architecture composed of multiple layers (multiple levels of encoding) to carry out the classification process of those images/programs. We compare the performance of this approach against traditional machine learning and pattern recognition algorithms. Our experimental results show that the deep learning architecture yields a boost in performance over those conventional/standard algorithms. A hold-out validation analysis using the superior architecture shows an accuracy on the order of 99.15%.
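
The preprocessing step described in the abstract (viewing a program as a gray scale image) can be illustrated with a minimal sketch. The abstract does not give the exact procedure, so everything here is an assumption made for illustration: the hypothetical function name `bytes_to_grayscale`, the mapping of one byte to one pixel intensity (0 to 255), the fixed row width, and the zero-padding of the last row.

```python
from typing import List


def bytes_to_grayscale(data: bytes, width: int = 16) -> List[List[int]]:
    """Arrange raw bytes into rows of `width` pixels, one byte per pixel (0-255).

    The final row is zero-padded so that every row has the same length.
    NOTE: `width` and the zero-padding are illustrative assumptions,
    not parameters taken from the paper.
    """
    if width <= 0:
        raise ValueError("width must be positive")
    # Pad so that the byte count is a multiple of the row width.
    padding = (-len(data)) % width
    padded = data + bytes(padding)
    # Slice the padded byte string into consecutive rows.
    return [list(padded[i:i + width]) for i in range(0, len(padded), width)]


if __name__ == "__main__":
    sample = bytes([0, 255, 16, 32, 64])  # stand-in for a program's raw bytes
    for row in bytes_to_grayscale(sample, width=4):
        print(row)
```

Each row of the returned list is one image row. In a pipeline like the one the abstract outlines, such an image would typically be resized or flattened before being passed to the encoding layers; that step is not shown here.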

URL: https://ieeexplore.ieee.org/document/8268747/
DOI: 10.1109/NAECON.2017.8268747
Citation Key: kebede_classification_2017