A Tree-Based Machine Learning Methodology to Automatically Classify Software Vulnerabilities

Title: A Tree-Based Machine Learning Methodology to Automatically Classify Software Vulnerabilities
Publication Type: Conference Paper
Year of Publication: 2021
Authors: Aivatoglou, Georgios; Anastasiadis, Mike; Spanos, Georgios; Voulgaridis, Antonis; Votis, Konstantinos; Tzovaras, Dimitrios
Conference Name: 2021 IEEE International Conference on Cyber Security and Resilience (CSR)
Date Published: July
Keywords: Conferences, cyber-security, Databases, Decision trees, Forestry, gradient boosting, Hardware, Human Behavior, machine learning, Manuals, policy-based governance, pubcrawl, random forests, resilience, Resiliency, security, security weaknesses, Software, Software Vulnerability categorization
Abstract: Software vulnerabilities have become a major problem for security analysts, since the number of new vulnerabilities is constantly growing. This growth created a need for a categorization system to group and handle vulnerabilities more efficiently. To that end, the MITRE Corporation introduced the Common Weakness Enumeration, a list of the most common software and hardware vulnerabilities. However, the manual understanding and analysis of new vulnerabilities by security experts is a very slow and exhausting process. For this reason, this paper introduces a new automated classification methodology based on the textual descriptions of vulnerabilities from the National Vulnerability Database. The proposed methodology combines textual analysis and tree-based machine learning techniques in order to classify vulnerabilities automatically. In the experiments, the proposed methodology achieved an overall accuracy close to 80%.
DOI: 10.1109/CSR51186.2021.9527965
Citation Key: aivatoglou_tree-based_2021
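
Illustrative note: the abstract describes combining textual analysis of vulnerability descriptions with tree-based learners. The sketch below is not the authors' pipeline. It is a minimal, standard-library illustration of the general idea, assuming a bag-of-words representation and a single decision tree split on Gini impurity, whereas the record's keywords mention random forests and gradient boosting. The training descriptions, the stop-word list, and the function names (tokenize, gini, build_tree, predict) are all hypothetical and invented for this example; none of the text comes from the National Vulnerability Database.

```python
import re
from collections import Counter

# Hypothetical toy data: invented descriptions and CWE labels, not real NVD entries.
TRAIN = [
    ("cross-site scripting allows remote attackers to inject arbitrary web script", "CWE-79"),
    ("stored xss in the comment field allows script injection", "CWE-79"),
    ("sql injection in the login form allows attackers to execute arbitrary sql commands", "CWE-89"),
    ("the id parameter is vulnerable to sql injection", "CWE-89"),
    ("buffer overflow in the parser allows attackers to cause a denial of service", "CWE-119"),
    ("heap-based buffer overflow when processing a crafted image", "CWE-119"),
]

# Assumed stop-word list; a real pipeline would use a fuller one.
STOPWORDS = {"a", "the", "in", "to", "of", "is", "when"}


def tokenize(text):
    """Textual analysis step: lowercase, split into words, drop stop words."""
    return set(re.findall(r"[a-z]+", text.lower())) - STOPWORDS


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means a pure node)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def build_tree(rows):
    """Recursively build a tree over (token_set, label) rows.

    A leaf is a label string; an inner node is (word, subtree_if_present,
    subtree_if_absent).
    """
    labels = [label for _, label in rows]
    majority = Counter(labels).most_common(1)[0][0]
    if len(set(labels)) == 1:
        return majority
    best = None
    vocabulary = sorted(set().union(*(tokens for tokens, _ in rows)))
    for word in vocabulary:
        present = [row for row in rows if word in row[0]]
        absent = [row for row in rows if word not in row[0]]
        if not present or not absent:
            continue  # this word does not split the rows
        # Weighted Gini impurity of the two children.
        score = (
            len(present) * gini([label for _, label in present])
            + len(absent) * gini([label for _, label in absent])
        ) / len(rows)
        if best is None or score < best[0]:
            best = (score, word, present, absent)
    if best is None:
        return majority  # no usable split: fall back to the majority class
    _, word, present, absent = best
    return (word, build_tree(present), build_tree(absent))


def predict(tree, text):
    """Walk the tree using the words present in the description."""
    tokens = tokenize(text)
    while isinstance(tree, tuple):
        word, if_present, if_absent = tree
        tree = if_present if word in tokens else if_absent
    return tree


tree = build_tree([(tokenize(text), label) for text, label in TRAIN])
print(predict(tree, "stack buffer overflow in the decoder"))
print(predict(tree, "attackers can inject web script through the profile page"))
```

With only six training descriptions, the tree keys on a couple of words and sends anything unfamiliar to a default branch, so this shows the mechanics only and says nothing about the accuracy reported in the abstract. An ensemble such as a random forest would train many such trees on resampled data and combine their votes.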