Static Analysis with Paragraph Vector for Malware Detection

Title: Static Analysis with Paragraph Vector for Malware Detection
Publication Type: Conference Paper
Year of Publication: 2017
Authors: Nagano, Yuta; Uda, Ryuya
Conference Name: Proceedings of the 11th International Conference on Ubiquitous Information Management and Communication
Date Published: January 2017
Publisher: ACM
Conference Location: New York, NY, USA
ISBN Number: 978-1-4503-4888-1
Keywords: Human Behavior, k-nearest neighbor algorithm, machine learning, Malware, malware analysis, Metrics, paragraph vector, privacy, pubcrawl, Resiliency, static analysis, support vector machine, threat vectors
Abstract

Malware damages computers, and the threat it poses is serious. Malware can be detected by pattern matching or by dynamic heuristic methods. However, existing methods have difficulty detecting every new malware subspecies. In this paper, we propose a method that automatically detects new malware subspecies by combining static analysis of executable files with machine learning. The method distinguishes malware from benignware and also classifies malware subspecies into malware families. It combines static analysis of executable files with a machine learning classifier and with machine-learning-based natural language processing. Static analysis of malware and benignware executables yields DLL import information, assembly code, and hexdumps, from which feature vectors are created. Paragraph vectors of this static-analysis output are learned with the PV-DBOW model from natural language processing. Our method uses a support vector machine (SVM) and a k-nearest neighbor classifier, each trained on the paragraph vectors. Unknown executable files are classified as malware or benignware by the trained SVM, and malware subspecies are classified into malware families by the trained k-nearest neighbor classifier. We evaluate the classification accuracy experimentally. We believe that our method can effectively detect new malware subspecies without relying on existing malware analysis methods such as the generic method and the dynamic heuristic method.
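
As a rough illustration of two steps the abstract describes (treating static-analysis output as text, and assigning a malware family by k-nearest neighbor), the following is a minimal standard-library sketch, not the authors' implementation. The function names, the toy two-dimensional vectors, the family labels, and the choice of cosine similarity are all assumptions made for this example. Training the PV-DBOW model and the SVM is omitted; in practice a library could supply them (for instance, gensim's `Doc2Vec` with `dm=0` trains PV-DBOW), which is likewise an assumption and not something the record states.

```python
import math
from collections import Counter


def hexdump_tokens(data):
    """Turn raw bytes into one two-digit hex token per byte.

    The resulting token list plays the role of a "document" whose words
    a paragraph-vector model could be trained on (assumed tokenization).
    """
    return [f"{b:02x}" for b in data]


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def knn_predict(train_vectors, train_labels, query, k=3):
    """Label `query` by majority vote among its k most similar training vectors."""
    ranked = sorted(
        zip(train_vectors, train_labels),
        key=lambda pair: cosine(pair[0], query),
        reverse=True,
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical 2-D "paragraph vectors" and family names, for illustration only.
vectors = [(1.0, 0.1), (0.9, 0.2), (0.1, 1.0), (0.2, 0.9)]
families = ["family_a", "family_a", "family_b", "family_b"]

# Classify an unseen vector by its three nearest neighbors.
prediction = knn_predict(vectors, families, (0.8, 0.3), k=3)
```

Here `hexdump_tokens(b"MZ\x90\x00")` yields the tokens `4d`, `5a`, `90`, `00`, and the toy query vector is assigned to `family_a` because two of its three nearest neighbors carry that label. Real paragraph vectors would have many more dimensions, and the distance measure and value of k would need to be chosen by experiment.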

URL: https://dl.acm.org/doi/10.1145/3022227.3022306
DOI: 10.1145/3022227.3022306
Citation Key: nagano_static_2017