Metadata Recovery from Obfuscated Programs Using Machine Learning

Title: Metadata Recovery from Obfuscated Programs Using Machine Learning
Publication Type: Conference Paper
Year of Publication: 2016
Authors: Salem, Aleieldin; Banescu, Sebastian
Conference Name: Proceedings of the 6th Workshop on Software Security, Protection, and Reverse Engineering
Publisher: ACM
Conference Location: New York, NY, USA
ISBN Number: 978-1-4503-4841-6
Keywords: composability, machine learning, obfuscation, privacy, pubcrawl, Resiliency, reverse engineering
Abstract

Obfuscation is a mechanism used to hinder reverse engineering of programs. To cope with the large number of obfuscated programs, especially malware, reverse engineers automate the process of deobfuscation, i.e., extracting information from obfuscated programs. Deobfuscation techniques target specific obfuscation transformations, which requires reverse engineers to manually identify the transformations used by a program, in what is known as a metadata recovery attack. In this paper, we present Oedipus, a Python framework that uses machine learning classifiers, viz. decision trees and naive Bayes, to automate metadata recovery attacks against obfuscated programs. We evaluated Oedipus' performance using two datasets totaling 1960 unobfuscated C programs, which were used to generate 11,075 programs obfuscated using 30 configurations of 6 different obfuscation transformations. Our results empirically show the feasibility of using machine learning to implement metadata recovery attacks, with classification accuracies of 100% in some cases.
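
The idea of a metadata recovery attack as a classification task can be illustrated with a minimal sketch. This is not the Oedipus implementation or its API: the token strings, the label names (`flatten`, `encode_arithmetic`, `virtualize`), and the helper functions `train_naive_bayes` and `classify` are all hypothetical, and the features the paper actually extracts are not described in the abstract. The sketch only shows how a naive Bayes classifier, one of the two classifier families the abstract names, could map features of an obfuscated program to the transformation that likely produced it.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy data (an assumption, not from the paper): each sample is a
# bag of instruction-like tokens paired with a made-up transformation label.
TRAIN = [
    ("mov jmp jmp cmp switch jmp", "flatten"),
    ("switch jmp cmp jmp mov switch", "flatten"),
    ("xor xor and not or xor", "encode_arithmetic"),
    ("and xor not xor or add", "encode_arithmetic"),
    ("call push pop call ret call", "virtualize"),
    ("push call call pop ret push", "virtualize"),
]


def train_naive_bayes(samples):
    """Count label frequencies and per-label token frequencies."""
    label_counts = Counter()
    token_counts = defaultdict(Counter)
    vocabulary = set()
    for text, label in samples:
        tokens = text.split()
        label_counts[label] += 1
        token_counts[label].update(tokens)
        vocabulary.update(tokens)
    return label_counts, token_counts, vocabulary


def classify(model, text):
    """Return the label with the highest log-posterior (add-one smoothing)."""
    label_counts, token_counts, vocabulary = model
    total_samples = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in label_counts.items():
        # Start from the log prior of the label.
        score = math.log(count / total_samples)
        label_total = sum(token_counts[label].values())
        for token in text.split():
            # Add-one smoothing keeps unseen tokens from zeroing the product.
            numerator = token_counts[label][token] + 1
            denominator = label_total + len(vocabulary)
            score += math.log(numerator / denominator)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


model = train_naive_bayes(TRAIN)
prediction = classify(model, "xor not xor and")
print(prediction)
```

On this toy data the classifier assigns the sample to the label whose training tokens it shares. A real attack would need features extracted from actual obfuscated programs and a held-out test set to measure accuracy.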

URL: http://doi.acm.org/10.1145/3015135.3015136
DOI: 10.1145/3015135.3015136
Citation Key: salem_metadata_2016