Text Mining for Malware Classification Using Multivariate All Repeated Patterns Detection

Submitted by grigby1 on Thu, 10/29/2020 - 11:12am

Title	Text Mining for Malware Classification Using Multivariate All Repeated Patterns Detection
Publication Type	Conference Paper
Year of Publication	2019
Authors	Xylogiannopoulos, Konstantinos F.; Karampelas, Panagiotis; Alhajj, Reda
Conference Name	2019 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM)
Date Published	August 2019
Publisher	ACM
Keywords	Android Malware Detection, Androids, ARPaD, bank accounts, cyber criminals, data mining, Human Behavior, Humanoid robots, Internet, invasive software, legitimate application, LERP-RSA, Malware, malware classification, malware dataset, malware detection, malware families, malware family classification, Metrics, mobile computing, mobile devices, Mobile handsets, mobile phones, mobile users, multivariate all repeated pattern detection, Payloads, privacy, pubcrawl, randomly selected malware applications, resilience, Resiliency, smart phones, smartphone users, Social network services, text mining
Abstract	Mobile phones have nowadays become a commodity for the majority of people. Using them, people can access the Internet and connect with their friends, their colleagues at work, or even unknown people with common interests. This proliferation of mobile devices has also been seen by cyber criminals as an opportunity to deceive smartphone users and steal their money, directly by accessing their bank accounts through the smartphones, or indirectly by blackmailing them or selling their private data, such as photos and credit card data, to third parties. This is usually achieved by masking a malevolent payload as a legitimate application and advertising it in the hope that mobile users will install it on their devices. Thus, any existing application can easily be modified by integrating malware and then presented as a legitimate one. In response, scientists have proposed a number of malware detection and classification methods using a variety of techniques. Even though several of them achieve relatively high precision in malware classification, there is still room for improvement. In this paper, we propose a text mining all repeated pattern detection method which uses the decompiled files of an application in order to classify a suspicious application into one of the known malware families. Based on the experimental results using a real malware dataset, the methodology tries to correctly classify (without any misclassification) all randomly selected malware applications of 3 categories with 3 different families each.
URL	https://dl.acm.org/doi/10.1145/3341161.3350841
DOI	10.1145/3341161.3350841
Citation Key	xylogiannopoulos_text_2019
