Towards Efficient Malware Detection and Classification using Multilayered Random Forest Ensemble Technique

Submitted by grigby1 on Thu, 10/29/2020 - 11:12am

Title	Towards Efficient Malware Detection and Classification using Multilayered Random Forest Ensemble Technique
Publication Type	Conference Paper
Year of Publication	2019
Authors	Roseline, S. Abijah, Sasisri, A. D., Geetha, S., Balasubramanian, C.
Conference Name	2019 International Carnahan Conference on Security Technology (ICCST)
Publisher	IEEE
ISBN Number	978-1-7281-1576-4
Keywords	Adaptation models, Computational modeling, cyber security, deep learning models, deep neural networks, Ensemble forest, feature extraction, Forestry, Gray-scale, Human Behavior, hybrid model, invasive software, learning (artificial intelligence), Malware, malware authors, malware classification, malware detection, malware images, malware patterns, Metrics, Microsoft Windows, multilayered random forest ensemble technique, pattern classification, privacy, pubcrawl, resilience, Resiliency, traditional malware, Vision-based malware analysis
Abstract	The exponential growth of malware is a significant security concern for computer users and for private and government organizations. Traditional malware detection methods employ static and dynamic analysis, which are ineffective at identifying unknown malware. Malware authors create new malware by applying polymorphic and evasion techniques to existing malware in order to escape detection. Newly arriving malware samples are variants of existing malware, and their patterns can be analyzed using a vision-based method: malware patterns are visualized as images and their features are characterized. Malware is classified by alternately generating class vectors and feature vectors with ensemble forests across multiple sequential layers. This paper proposes a hybrid stacked multilayered ensembling approach that is more robust and efficient than deep learning models. The proposed model outperforms the machine learning and deep learning models, achieving an accuracy of 98.91%. The proposed system works well for both small-scale and large-scale data because it adaptively sets its parameters (the number of sequential levels) automatically. It is computationally efficient in terms of resources and time, and it uses far fewer hyper-parameters than deep neural networks.
URL	https://ieeexplore.ieee.org/document/8888406/
DOI	10.1109/CCST.2019.8888406
Citation Key	roseline_towards_2019
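
The abstract states that malware patterns are visualized as images but does not say how the images are produced. A minimal sketch of one common convention in vision-based malware analysis is given here, assuming each byte of a binary becomes one gray-scale pixel and rows wrap at a fixed width. The function name `bytes_to_grayscale`, the default `width`, and the zero padding are illustrative assumptions, not details taken from the paper.

```python
from typing import List


def bytes_to_grayscale(data: bytes, width: int = 256) -> List[List[int]]:
    """Map each byte (0-255) to one gray pixel and wrap rows at `width`.

    Assumption for illustration: the last row is zero-padded so that
    every row has the same length.
    """
    if width <= 0:
        raise ValueError("width must be positive")
    rows = []
    for start in range(0, len(data), width):
        chunk = data[start:start + width]
        # Iterating over bytes yields ints in 0..255, already valid gray levels.
        row = list(chunk) + [0] * (width - len(chunk))
        rows.append(row)
    return rows


# Made-up byte string standing in for the contents of a real binary file.
sample = bytes([0, 64, 128, 255, 17])
image = bytes_to_grayscale(sample, width=4)
print(len(image), "rows x", len(image[0]), "columns")
```

The resulting rows could then be flattened or passed through a feature extractor before being handed to a classifier such as the layered forest ensemble described in the abstract; that later stage is not sketched here.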
