A Tree-Based Machine Learning Methodology to Automatically Classify Software Vulnerabilities

Submitted by grigby1 on Mon, 04/18/2022 - 2:36pm

Title	A Tree-Based Machine Learning Methodology to Automatically Classify Software Vulnerabilities
Publication Type	Conference Paper
Year of Publication	2021
Authors	Aivatoglou, Georgios, Anastasiadis, Mike, Spanos, Georgios, Voulgaridis, Antonis, Votis, Konstantinos, Tzovaras, Dimitrios
Conference Name	2021 IEEE International Conference on Cyber Security and Resilience (CSR)
Date Published	July
Keywords	Conferences, cyber-security, Databases, Decision trees, Forestry, gradient boosting, Hardware, Human Behavior, machine learning, Manuals, policy-based governance, pubcrawl, random forests, resilience, Resiliency, security, security weaknesses, Software, Software Vulnerability categorization
Abstract	Software vulnerabilities have become a major problem for security analysts, since the number of new vulnerabilities is constantly growing. This created a need for a categorization system to group and handle these vulnerabilities more efficiently. To that end, the MITRE Corporation introduced the Common Weakness Enumeration (CWE), a list of the most common software and hardware vulnerabilities. However, manually understanding and analyzing new vulnerabilities is a very slow and exhausting process for security experts. For this reason, this paper introduces a new automated classification methodology based on the textual descriptions of vulnerabilities in the National Vulnerability Database (NVD). The proposed methodology combines textual analysis and tree-based machine learning techniques to classify vulnerabilities automatically. In the experiments, the proposed methodology achieved an overall accuracy close to 80%.
DOI	10.1109/CSR51186.2021.9527965
Citation Key	aivatoglou_tree-based_2021
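
Illustrative sketch (not from the paper): the abstract describes combining textual analysis with tree-based learning to assign CWE categories to vulnerability descriptions. The record does not give the authors' actual features, models, or data, so the following is only a minimal standard-library sketch of the general idea, using a single entropy-based decision tree over word-presence features. The training descriptions are invented toy examples, not NVD entries, and the names TRAIN, STOPWORDS, tokenize, build_tree, and predict are hypothetical.

```python
import math
import re
from collections import Counter

# Invented toy descriptions (NOT real NVD entries), labelled with CWE IDs.
TRAIN = [
    ("SQL injection in login form allows remote attackers to execute arbitrary SQL commands", "CWE-89"),
    ("Crafted id parameter leads to SQL injection in the search query", "CWE-89"),
    ("Cross-site scripting allows attackers to inject arbitrary web script via the name field", "CWE-79"),
    ("Stored cross-site scripting in the comment page lets attackers inject HTML", "CWE-79"),
    ("Buffer overflow in the parser allows attackers to cause a denial of service", "CWE-119"),
    ("Heap-based buffer overflow when reading a crafted image file", "CWE-119"),
]

# Assumed stopword list; a real pipeline would use a fuller one.
STOPWORDS = {"a", "an", "and", "the", "in", "to", "of", "via", "when"}


def tokenize(text):
    """Textual analysis step: lowercase, split into words, drop stopwords."""
    return set(re.findall(r"[a-z]+", text.lower())) - STOPWORDS


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def build_tree(rows):
    """Grow a decision tree from (token_set, label) rows by information gain."""
    labels = [label for _, label in rows]
    if len(set(labels)) == 1:
        return labels[0]  # pure node becomes a leaf
    base = entropy(labels)
    best = None
    for word in sorted(set().union(*(tokens for tokens, _ in rows))):
        present = [r for r in rows if word in r[0]]
        absent = [r for r in rows if word not in r[0]]
        if not present or not absent:
            continue  # this word does not split the rows
        remainder = sum(
            len(part) / len(rows) * entropy([label for _, label in part])
            for part in (present, absent)
        )
        gain = base - remainder
        if best is None or gain > best[0]:
            best = (gain, word, present, absent)
    if best is None:
        return Counter(labels).most_common(1)[0][0]  # fall back to majority class
    _, word, present, absent = best
    return {"word": word, "yes": build_tree(present), "no": build_tree(absent)}


def predict(tree, text):
    """Walk the tree, branching on whether each test word occurs in the text."""
    tokens = tokenize(text)
    while isinstance(tree, dict):
        tree = tree["yes"] if tree["word"] in tokens else tree["no"]
    return tree


tree = build_tree([(tokenize(text), label) for text, label in TRAIN])
print(predict(tree, "Stack buffer overflow in the packet handler"))
```

A single tree over six examples is far too brittle for real use. The record's keywords mention random forests and gradient boosting, which combine many such trees, and those ensembles would normally come from an established library rather than hand-written code.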
