Text Mining for Malware Classification Using Multivariate All Repeated Patterns Detection
Title | Text Mining for Malware Classification Using Multivariate All Repeated Patterns Detection |
Publication Type | Conference Paper |
Year of Publication | 2019 |
Authors | Xylogiannopoulos, Konstantinos F., Karampelas, Panagiotis, Alhajj, Reda |
Conference Name | 2019 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM) |
Date Published | August 2019 |
Publisher | ACM |
Keywords | Android Malware Detection, Androids, ARPaD, bank accounts, cyber criminals, data mining, Human Behavior, Humanoid robots, Internet, invasive software, legitimate application, LERP-RSA, Malware, malware classification, malware dataset, malware detection, malware families, malware family classification, Metrics, mobile computing, mobile devices, Mobile handsets, mobile phones, mobile users, multivariate all repeated pattern detection, Payloads, privacy, pubcrawl, randomly selected malware applications, resilience, Resiliency, smart phones, smartphone users, Social network services, text mining |
Abstract | Mobile phones have become a commodity for the majority of people. Using them, people can access the Internet and connect with friends, colleagues at work, or even strangers with common interests. Cyber criminals have seen this proliferation of mobile devices as an opportunity to deceive smartphone users and steal their money, either directly, by accessing their bank accounts through the smartphone, or indirectly, by blackmailing them or selling their private data (photos, credit card data, etc.) to third parties. This is usually achieved by masking a malevolent payload as a legitimate application and advertising it in the hope that mobile users will install it on their devices. Any existing application can thus easily be modified by integrating malware and then presented as a legitimate one. In response, scientists have proposed a number of malware detection and classification methods using a variety of techniques. Even though several of them achieve relatively high precision in malware classification, there is still room for improvement. In this paper, we propose a text mining method based on all repeated pattern detection, which uses the decompiled files of an application to classify a suspicious application into one of the known malware families. Based on the experimental results using a real malware dataset, the methodology tries to correctly classify (without any misclassification) all randomly selected malware applications of 3 categories with 3 different families each. |
URL | https://dl.acm.org/doi/10.1145/3341161.3350841 |
DOI | 10.1145/3341161.3350841 |
Citation Key | xylogiannopoulos_text_2019 |
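
The abstract describes classifying a suspicious application by the repeated patterns found in its decompiled files. The sketch below illustrates that general idea only. It is not the paper's ARPaD or LERP-RSA implementation: it substitutes a naive fixed-range substring count for repeated pattern detection. The helper names (`ngrams`, `family_profile`, `classify`), the family labels, and the toy strings standing in for decompiled code are all hypothetical.

```python
from __future__ import annotations

from collections import Counter


def ngrams(text: str, min_len: int = 4, max_len: int = 6) -> set[str]:
    """All distinct substrings of text with length min_len..max_len."""
    return {
        text[i:i + n]
        for n in range(min_len, max_len + 1)
        for i in range(len(text) - n + 1)
    }


def family_profile(samples: list[str]) -> set[str]:
    """Patterns that repeat across a family: present in at least two samples."""
    counts: Counter[str] = Counter()
    for sample in samples:
        # Each sample contributes a pattern at most once (ngrams returns a set).
        counts.update(ngrams(sample))
    return {pattern for pattern, seen in counts.items() if seen >= 2}


def classify(suspect: str, profiles: dict[str, set[str]]) -> str:
    """Return the family whose profile is best covered by the suspect's patterns."""
    grams = ngrams(suspect)
    scores = {
        family: (len(grams & profile) / len(profile) if profile else 0.0)
        for family, profile in profiles.items()
    }
    return max(scores, key=scores.get)


# Hypothetical toy data: short strings standing in for decompiled source text.
TRAINING = {
    "family_a": [
        "sendTextMessage(premiumNumber)",
        "x=sendTextMessage(premiumNumber);",
    ],
    "family_b": [
        "getDeviceId();uploadContacts()",
        "uploadContacts();getDeviceId()",
    ],
}
SUSPECT = "void run(){sendTextMessage(premiumNumber);}"

profiles = {family: family_profile(samples) for family, samples in TRAINING.items()}
print(classify(SUSPECT, profiles))
```

Scoring by the fraction of a family's profile found in the suspect keeps the comparison independent of profile size. A real system would need a scalable pattern detector and far larger samples than these toy strings.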