Automatic labeling of the elements of a vulnerability report CVE with NLP

Title: Automatic labeling of the elements of a vulnerability report CVE with NLP
Publication Type: Conference Paper
Year of Publication: 2022
Authors: Sumoto, Kensuke; Kanakogi, Kenta; Washizaki, Hironori; Tsuda, Naohiko; Yoshioka, Nobukazu; Fukazawa, Yoshiaki; Kanuka, Hideyuki
Conference Name: 2022 IEEE 23rd International Conference on Information Reuse and Integration for Data Science (IRI)
Keywords: BERT, composability, compositionality, CVE, Data Science, Databases, distortion, Information Reuse, machine learning, named entity recognition, natural language processing, pubcrawl, resilience, Resiliency, security, security knowledge repository, Software, Technological, Transformers
Abstract: Common Vulnerabilities and Exposures (CVE) databases contain information about vulnerabilities in software products and source code. If the individual elements of CVE descriptions can be extracted and structured, the resulting data can be used to search and analyze those descriptions. Herein we propose a method that labels each element in a CVE description by applying Named Entity Recognition (NER). For NER, we use BERT, a transformer-based natural language processing model. Machine-learning-based NER can label information in CVE descriptions even when the data contains some distortions. In an experiment using manually prepared labels for 1000 CVE descriptions, the proposed method achieves a precision of about 0.81 and a recall of about 0.89. In addition, we devise a training scheme in which the data is divided by label. The proposed method can be used to label each element of a CVE description automatically.
DOI: 10.1109/IRI54793.2022.00045
Citation Key: sumoto_automatic_2022
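
The abstract reports precision and recall for NER labels on CVE descriptions. As a rough illustration of how such figures can be computed, the sketch below scores a predicted BIO tag sequence against a gold one at the entity level. It rests on several assumptions not stated in the record: the label names (WEAKNESS, PRODUCT, VERSION, ATTACKER, IMPACT), the example sentence, the BIO tagging scheme, and the choice of exact-match entity-level scoring are all hypothetical, and the paper's own label set and evaluation protocol may differ.

```python
from typing import List, Set, Tuple

# (label, start index, end index exclusive)
Span = Tuple[str, int, int]


def extract_spans(tags: List[str]) -> Set[Span]:
    """Collect entity spans from a BIO tag sequence."""
    spans: Set[Span] = set()
    label, start = None, 0
    # A trailing "O" sentinel closes any span still open at the end.
    for i, tag in enumerate(tags + ["O"]):
        continues = tag.startswith("I-") and label == tag[2:]
        if label is not None and not continues:
            spans.add((label, start, i))
            label = None
        # Open a span on "B-", or on a stray "I-" with no span in progress.
        if tag.startswith("B-") or (tag.startswith("I-") and label is None):
            label, start = tag[2:], i
    return spans


def precision_recall(gold: Set[Span], pred: Set[Span]) -> Tuple[float, float]:
    """Exact-match scoring: a prediction counts only if label and bounds agree."""
    true_positives = len(gold & pred)
    precision = true_positives / len(pred) if pred else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical CVE-style description and hypothetical labels (assumptions).
tokens = ["Buffer", "overflow", "in", "ExampleServer", "1.2", "allows",
          "remote", "attackers", "to", "execute", "arbitrary", "code"]
gold_tags = ["B-WEAKNESS", "I-WEAKNESS", "O", "B-PRODUCT", "B-VERSION", "O",
             "B-ATTACKER", "I-ATTACKER", "O", "B-IMPACT", "I-IMPACT", "I-IMPACT"]
# The prediction merges the version into the product span.
pred_tags = ["B-WEAKNESS", "I-WEAKNESS", "O", "B-PRODUCT", "I-PRODUCT", "O",
             "B-ATTACKER", "I-ATTACKER", "O", "B-IMPACT", "I-IMPACT", "I-IMPACT"]

gold_spans = extract_spans(gold_tags)
pred_spans = extract_spans(pred_tags)
precision, recall = precision_recall(gold_spans, pred_spans)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this toy case three of the four predicted spans match the five gold spans exactly, so a single boundary error lowers both precision and recall. Token-level scoring would be more lenient.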