Fast Model Learning for the Detection of Malicious Digital Documents

Submitted by grigby1 on Fri, 02/02/2018 - 12:28pm

Title	Fast Model Learning for the Detection of Malicious Digital Documents
Publication Type	Conference Paper
Year of Publication	2017
Authors	Scofield, Daniel; Miles, Craig; Kuhn, Stephen
Conference Name	Proceedings of the 7th Software Security, Protection, and Reverse Engineering / Software Security and Protection Workshop
Publisher	ACM
Conference Location	New York, NY, USA
ISBN Number	978-1-4503-5387-8
Keywords	anomaly detection, composability, dynamic analysis, malware classification, malware detection, pubcrawl, Scalability, software assurance
Abstract	Modern cyber attacks are often conducted by distributing digital documents that contain malware. The approach detailed herein, which consists of a classifier that uses features derived from dynamic analysis of a document viewer as it renders the document in question, is capable of classifying the disposition of digital documents with greater than 98% accuracy even when its model is trained on just small amounts of data. To keep the classification model itself small and thereby to provide scalability, we employ an entity resolution strategy that merges syntactically disparate features that are thought to be semantically equivalent but vary due to programmatic randomness. Entity resolution enables construction of a comprehensive model of benign functionality using relatively few training documents, and the model does not improve significantly with additional training data.
URL	http://doi.acm.org/10.1145/3151137.3151142
DOI	10.1145/3151137.3151142
Citation Key	scofield_fast_2017
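
Illustrative sketch (not taken from the paper): the abstract describes merging syntactically different features that are believed to be semantically equivalent, then modeling benign behavior from a small number of training documents. The Python below is a minimal sketch of that general idea under loud assumptions. The feature strings, the normalization rules in `_RULES`, and the function names `resolve`, `learn_benign_model`, and `is_anomalous` are all hypothetical; the abstract does not specify the authors' actual features, merging rules, or classifier.

```python
import re

# Hypothetical normalization rules. The abstract does not state which
# sources of randomness the authors actually merge; these are stand-ins.
_RULES = [
    (re.compile(r"0x[0-9a-f]+"), "<HEX>"),  # addresses, handles
    (re.compile(r"\d+"), "<NUM>"),          # counters, random suffixes
]


def resolve(feature: str) -> str:
    """Map a raw feature string to a canonical form so that variants
    differing only in random-looking tokens collapse to one entity."""
    feature = feature.lower()
    for pattern, placeholder in _RULES:
        feature = pattern.sub(placeholder, feature)
    return feature


def learn_benign_model(traces):
    """Build a set of canonical features seen while rendering benign documents."""
    model = set()
    for trace in traces:
        model.update(resolve(f) for f in trace)
    return model


def is_anomalous(trace, model, max_unseen=0):
    """Flag a document whose trace has more than max_unseen canonical
    features absent from the benign model (threshold is an assumption)."""
    unseen = {resolve(f) for f in trace} - model
    return len(unseen) > max_unseen


# Made-up traces for illustration only.
benign = [["open C:\\Temp\\acr1234.tmp", "read 0x7ffde000"]]
model = learn_benign_model(benign)
```

Because both temporary-file names above reduce to the same canonical string, the model stays small as more benign documents are added, which is the scalability effect the abstract attributes to entity resolution.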
