Android Malware Family Classification Based on Sensitive Opcode Sequence
Title | Android Malware Family Classification Based on Sensitive Opcode Sequence |
Publication Type | Conference Paper |
Year of Publication | 2019 |
Authors | Jiang, Jianguo; Li, Song; Yu, Min; Li, Gang; Liu, Chao; Chen, Kai; Liu, Hui; Huang, Weiqing |
Conference Name | 2019 IEEE Symposium on Computers and Communications (ISCC) |
Date Published | July 2019 |
Publisher | IEEE |
Keywords | Android (operating system), Android malware, Android malware analysis, Android Malware Detection, Android malware family classification model, Android malware forensics, application program interfaces, code specific semantic information, digital forensics, Drebin dataset, family classification, feature extraction, Human Behavior, invasive software, learning (artificial intelligence), malware classification, Metrics, mobile computing, multiple class classification, oversampling technique, pattern classification, privacy, pubcrawl, resilience, Resiliency, Semantic, semantic related vector, sensitive API, sensitive opcode, sensitive semantic feature-sensitive opcode sequence |
Abstract | Android malware family classification is an advanced task in Android malware analysis, detection and forensics. Existing methods and models have achieved a certain success for Android malware detection, but the accuracy and the efficiency are still not up to the expectation, especially in the context of multiple class classification with imbalanced training data. To address those challenges, we propose an Android malware family classification model by analyzing the code's specific semantic information based on sensitive opcode sequence. In this work, we construct a sensitive semantic feature-sensitive opcode sequence using opcodes, sensitive APIs, STRs and actions, and propose to analyze the code's specific semantic information, generate a semantic related vector for Android malware family classification based on this feature. Besides, aiming at the families with minority, we adopt an oversampling technique based on the sensitive opcode sequence. Finally, we evaluate our method on Drebin dataset, and select the top 40 malware families for experiments. The experimental results show that the Total Accuracy and Average AUC (Area Under Curve, AUC) reach 99.50% and 98.86% with 45.17s per Android malware, and even if the number of malware families increases, these results remain good. |
URL | https://ieeexplore.ieee.org/document/8969656 |
DOI | 10.1109/ISCC47284.2019.8969656 |
Citation Key | jiang_android_2019 |