Using Software Metrics for Predicting Vulnerable Code-Components: A Study on Java and Python Open Source Projects
Title | Using Software Metrics for Predicting Vulnerable Code-Components: A Study on Java and Python Open Source Projects |
Publication Type | Conference Paper |
Year of Publication | 2019 |
Authors | Chong, T., Anu, V., Sultana, K. Z. |
Conference Name | 2019 IEEE International Conference on Computational Science and Engineering (CSE) and IEEE International Conference on Embedded and Ubiquitous Computing (EUC) |
Keywords | code component, code-component, Java, Java projects, Java vulnerable functions, learning (artificial intelligence), machine learning, Measurement, Metrics, metrics testing, pubcrawl, public domain software, python, Python open source projects, Python vulnerable function prediction, safety-critical software, security, security of data, security vulnerabilities, software metrics, software metrics-based vulnerability prediction, software projects, software reliability, software security, Testing, Vulnerability prediction, vulnerability prediction performance, vulnerability predictors |
Abstract | Software vulnerabilities often remain hidden until an attacker exploits the weak/insecure code. Therefore, testing the software from a vulnerability discovery perspective becomes challenging for developers if they do not inspect their code thoroughly (which is time-consuming). We propose that vulnerability prediction using certain software metrics can support the testing process by identifying vulnerable code-components (e.g., functions, classes, etc.). Once a code-component is predicted as vulnerable, the developers can focus their testing efforts on it, thereby avoiding the time/effort required for testing the entire application. The current paper presents a study that compares how software metrics perform as vulnerability predictors for software projects developed in two different languages (Java vs Python). The goal of this research is to analyze the vulnerability prediction performance of software metrics for different programming languages. We designed and conducted experiments on security vulnerabilities reported for three Java projects (Apache Tomcat 6, Tomcat 7, Apache CXF) and two Python projects (Django and Keystone). In this paper, we focus on a specific type of code component: Functions. We apply Machine Learning models for predicting vulnerable functions. Overall results show that software metrics-based vulnerability prediction is more useful for Java projects than Python projects (i.e., software metrics when used as features were able to predict Java vulnerable functions with a higher recall and precision compared to Python vulnerable functions prediction). |
DOI | 10.1109/CSE/EUC.2019.00028 |
Citation Key | chong_using_2019 |
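The abstract compares predictors by recall and precision over functions labeled vulnerable or not. The following is a minimal sketch of how those two measures are computed from function-level predictions. It is not the authors' pipeline: the paper applies Machine Learning models, whereas this sketch substitutes a simple threshold rule to stay self-contained. The metric names (`lines_of_code`, `cyclomatic_complexity`), the thresholds, and the sample records are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class FunctionRecord:
    """One function with two illustrative metrics and a ground-truth label."""
    name: str
    lines_of_code: int
    cyclomatic_complexity: int
    vulnerable: bool  # True if a vulnerability was reported for this function


def predict_vulnerable(rec, loc_threshold=50, complexity_threshold=10):
    """Hypothetical stand-in for a trained model: flag large or complex functions."""
    return (rec.lines_of_code > loc_threshold
            or rec.cyclomatic_complexity > complexity_threshold)


def precision_recall(records, predictor=predict_vulnerable):
    """Return (precision, recall) of the predictor over the labeled records."""
    tp = fp = fn = 0
    for rec in records:
        predicted = predictor(rec)
        if predicted and rec.vulnerable:
            tp += 1  # correctly flagged as vulnerable
        elif predicted and not rec.vulnerable:
            fp += 1  # flagged, but no vulnerability reported
        elif not predicted and rec.vulnerable:
            fn += 1  # missed vulnerable function
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Made-up sample data, not taken from the paper.
SAMPLE = [
    FunctionRecord("parse_request", 120, 18, True),
    FunctionRecord("render_page", 80, 4, False),
    FunctionRecord("check_token", 30, 3, True),
    FunctionRecord("format_date", 12, 2, False),
    FunctionRecord("load_config", 45, 14, True),
]

if __name__ == "__main__":
    p, r = precision_recall(SAMPLE)
    print(f"precision={p:.2f} recall={r:.2f}")
```

Precision is the share of flagged functions that are actually vulnerable, and recall is the share of vulnerable functions that were flagged. A higher value of both, as the abstract reports for the Java projects, means testers inspect fewer false alarms and miss fewer vulnerable functions.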