Detection of Malware using Machine Learning based on Operation Code Frequency

Title: Detection of Malware using Machine Learning based on Operation Code Frequency
Publication Type: Conference Paper
Year of Publication: 2021
Authors: Mohandas, Pavitra; Santhosh Kumar, Sudesh Kumar; Kulyadi, Sandeep Pai; Shankar Raman, M J; S, Vasan V; Venkataswami, Balaji
Conference Name: 2021 IEEE International Conference on Industry 4.0, Artificial Intelligence, and Communications Technology (IAICT)
Date Published: July
Keywords: Conferences, Frequency conversion, Human Behavior, machine learning, machine learning algorithms, Malware, malware analysis, malware detection, metamorphic malware, Metrics, naive Bayes methods, Opcode Frequency, Predictive models, privacy, pubcrawl, resilience, Resiliency, Training
Abstract: One of the many methods for identifying malware is to disassemble the malware files and obtain the opcodes from them. Because malware has predominantly been found to contain specific opcode sequences, the presence of the same sequences in any incoming file or network content can serve as a possible malware identification scheme. Malware detection systems help us understand how malware attacks a system and how such attacks can be prevented. The proposed method analyses malware executable files using opcode information: it converts the incoming executable files to assembly language and extracts the opcode count from them. The opcode count is then converted into opcode frequency, which is stored in CSV format. The CSV file is passed to several machine learning algorithms: a Decision Tree Classifier, a Random Forest Classifier, and a Naive Bayes Classifier. The Random Forest Classifier produced the highest accuracy, so that model was used to predict whether an incoming file contains potential malware.
DOI: 10.1109/IAICT52856.2021.9532521
Citation Key: mohandas_detection_2021
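
The abstract outlines a pipeline of opcode count, then opcode frequency, then a CSV file. The following is a minimal sketch of that feature-extraction step only, not the authors' implementation. Everything named here is an assumption for illustration: the `OPCODES` vocabulary, the one-instruction-per-line disassembly format with `;` comments, the definition of frequency as count divided by the total over the vocabulary, and the `file` and `label` columns.

```python
import csv
import io
from collections import Counter

# Hypothetical opcode vocabulary; the abstract does not list the paper's features.
OPCODES = ["mov", "push", "pop", "call", "jmp", "add"]


def opcode_counts(asm_lines):
    """Count mnemonics, assuming one instruction per line with the mnemonic first."""
    counts = Counter()
    for line in asm_lines:
        code = line.split(";", 1)[0].strip()  # drop trailing ';' comments
        if not code or code.endswith(":"):    # skip blank lines and labels
            continue
        counts[code.split()[0].lower()] += 1
    return counts


def opcode_frequencies(counts, vocabulary=OPCODES):
    """Relative frequency per opcode: count / total over the vocabulary (assumed definition)."""
    total = sum(counts[op] for op in vocabulary)
    if total == 0:
        return {op: 0.0 for op in vocabulary}
    return {op: counts[op] / total for op in vocabulary}


def write_csv(rows, out, vocabulary=OPCODES):
    """Write one row per file: file name, one frequency column per opcode, class label."""
    writer = csv.DictWriter(out, fieldnames=["file"] + list(vocabulary) + ["label"])
    writer.writeheader()
    for row in rows:
        writer.writerow(row)


# Made-up disassembly listing used only to exercise the functions.
sample = [
    "start:",
    "push ebp",
    "mov ebp, esp",
    "mov eax, 1 ; set return value",
    "pop ebp",
    "ret",
]
counts = opcode_counts(sample)
freqs = opcode_frequencies(counts)

buffer = io.StringIO()
write_csv([{"file": "sample.exe", **freqs, "label": 0}], buffer)
print(buffer.getvalue())
```

A CSV produced this way could then be loaded and split into features and labels for the classifiers the abstract names, for example scikit-learn's `RandomForestClassifier`; the abstract does not say which library the authors used.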