Predicting buffer overflow using semi-supervised learning
Title | Predicting buffer overflow using semi-supervised learning |
Publication Type | Conference Paper |
Year of Publication | 2016 |
Authors | Meng, Q., Wen, Shameng, Feng, Chao, Tang, Chaojing |
Conference Name | 2016 9th International Congress on Image and Signal Processing, BioMedical Engineering and Informatics (CISP-BMEI) |
Keywords | 22-dimension vector extraction, Antlr, Arrays, AST, buffer overflow, buffer overflow vulnerability prediction, Buffer overflows, Buffer storage, C/C++ source files, classifier training, Clustering algorithms, Complexity theory, compositionality, Human Behavior, human factors, Indexes, learning (artificial intelligence), machine learning, Metrics, pattern classification, pubcrawl, Resiliency, security of data, semi-supervised learning, Semisupervised learning, software security, Taxonomy, vulnerability detection |
Abstract | Vulnerability detection is difficult and time-consuming, so making full use of unlabeled data is both necessary and helpful. Motivated by this, we propose a method for predicting buffer overflows based on semi-supervised learning. We first use Antlr to extract an AST from C/C++ source files. Then, following a taxonomy of 22 buffer overflow attributes, we extract a 22-dimension vector from every function in the AST. Finally, these vectors are used to train a classifier that predicts buffer overflow vulnerabilities. The experiments and evaluation indicate that the method is correct and efficient. |
URL | https://ieeexplore.ieee.org/document/7853039 |
DOI | 10.1109/CISP-BMEI.2016.7853039 |
Citation Key | meng_predicting_2016 |
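The pipeline in the abstract (one 22-dimension vector per function, then a classifier trained with the help of unlabeled data) can be illustrated with a minimal sketch. This record does not list the 22 attributes or name the semi-supervised algorithm, so everything below is an assumption made for illustration: the vectors are synthetic, the helper names (`toy_vector`, `self_train`, and so on) are hypothetical, and plain self-training over a nearest-centroid classifier stands in for whatever method the paper actually uses.

```python
import random

DIM = 22  # one slot per attribute; the real attributes are not given in this record


def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def fit(labeled):
    """Build one centroid per label from (vector, label) pairs."""
    by_label = {}
    for vec, label in labeled:
        by_label.setdefault(label, []).append(vec)
    return {label: centroid(vs) for label, vs in by_label.items()}


def predict(centroids, vec):
    """Return (nearest label, margin between the two closest centroids)."""
    ranked = sorted((sq_dist(c, vec), label) for label, c in centroids.items())
    return ranked[0][1], ranked[1][0] - ranked[0][0]


def self_train(labeled, unlabeled, min_margin=1.0, max_rounds=10):
    """Self-training: repeatedly pseudo-label confident unlabeled vectors.

    Assumes `labeled` contains at least one example of each of two classes.
    """
    labeled = list(labeled)
    pool = list(unlabeled)
    for _ in range(max_rounds):
        centroids = fit(labeled)
        confident, rest = [], []
        for vec in pool:
            label, margin = predict(centroids, vec)
            (confident if margin >= min_margin else rest).append((vec, label))
        if not confident:
            break  # nothing new to learn from
        labeled.extend(confident)
        pool = [vec for vec, _ in rest]
    return fit(labeled)


def toy_vector(rng, flagged):
    """Synthetic stand-in for a per-function feature vector (not real data)."""
    base = 3.0 if flagged else 0.0
    return [base + rng.random() for _ in range(DIM)]


rng = random.Random(0)
# A few labeled functions (1 = flagged as overflow-prone, 0 = not flagged).
labeled = [(toy_vector(rng, True), 1), (toy_vector(rng, False), 0)]
# Many unlabeled functions, which self-training folds in as pseudo-labels.
unlabeled = [toy_vector(rng, i % 2 == 0) for i in range(40)]

model = self_train(labeled, unlabeled)
label, _ = predict(model, toy_vector(rng, True))
print("predicted label for a new flagged-style vector:", label)
```

The margin threshold controls how aggressively unlabeled vectors are absorbed: a higher `min_margin` adds fewer but more reliable pseudo-labels per round. With real features the classes would overlap far more than in this toy data, so the threshold and the base classifier would both need tuning.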