Malware Classification Using Byte Sequence Information

Title: Malware Classification Using Byte Sequence Information
Publication Type: Conference Paper
Year of Publication: 2018
Authors: Jung, Byungho; Kim, Taeguen; Im, Eul Gyu
Conference Name: Proceedings of the 2018 Conference on Research in Adaptive and Convergent Systems
Date Published: October 2018
Publisher: ACM
Conference Location: New York, NY, USA
ISBN Number: 978-1-4503-5885-9
Keywords: CNN, Deep Learning, expert systems, Human Behavior, malware classification, pubcrawl, resilience, Resiliency, Scalability, security, static analysis
Abstract

The number of new malware samples and new malware variants has been increasing continuously. Security experts analyze malware to capture its malicious properties and to generate signatures or detection rules, but the analysis overhead keeps growing with the number of malware samples. To analyze a large amount of malware, various kinds of automatic analysis methods are needed. Recently, deep learning techniques such as convolutional neural networks (CNNs) and recurrent neural networks (RNNs) have been applied to malware classification. The features used in previous approaches are mostly based on API (Application Programming Interface) information, and API invocation information can be obtained through dynamic analysis. However, the invocation information may not reflect the malicious behaviors of malware because malware developers use various analysis avoidance techniques. Therefore, deep learning-based malware analysis using other features still needs to be developed to improve malware analysis performance. In this paper, we propose a malware classification method using a deep learning algorithm based on byte information. Our proposed method uses images generated from malware byte information that can reflect malware behavioral context, and convolutional neural network-based sentence analysis is used to process the generated images. We performed several experiments to show the effectiveness of our proposed method; the experimental results show that our method achieved higher accuracy than the naive CNN model, with a detection accuracy of about 99%.
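
The abstract does not say how the images are generated from byte information. As a rough illustration only, the sketch below assumes the simplest common scheme, in which each byte becomes one grayscale pixel and the bytes are laid out row by row. The function name `bytes_to_grayscale_rows`, the `width` parameter, and the sample bytes are hypothetical and are not taken from the paper.

```python
from typing import List


def bytes_to_grayscale_rows(data: bytes, width: int = 16) -> List[List[int]]:
    """Arrange raw bytes into rows of `width` pixel intensities (0-255).

    Assumption: one byte maps to one grayscale pixel. The last row is
    zero-padded so that every row has the same length.
    """
    if width <= 0:
        raise ValueError("width must be positive")
    rows = []
    for start in range(0, len(data), width):
        chunk = data[start:start + width]
        # Iterating over bytes yields integers in the range 0-255.
        row = list(chunk) + [0] * (width - len(chunk))
        rows.append(row)
    return rows


# Hypothetical sample: six bytes resembling the start of a PE file.
sample = bytes([0x4D, 0x5A, 0x90, 0x00, 0x03, 0x00])
image = bytes_to_grayscale_rows(sample, width=4)
```

A grid like `image` could then be passed to a CNN as a single-channel input. The network architecture and any preprocessing used by the authors are not described in this record.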

URL: https://dl.acm.org/doi/10.1145/3264746.3264775
DOI: 10.1145/3264746.3264775
Citation Key: jung_malware_2018