Data-Driven Insights from Vulnerability Discovery Metrics
Title | Data-Driven Insights from Vulnerability Discovery Metrics |
Publication Type | Conference Paper |
Year of Publication | 2019 |
Authors | Munaiah, Nuthan; Meneely, Andrew |
Conference Name | 2019 IEEE/ACM Joint 4th International Workshop on Rapid Continuous Software Engineering and 1st International Workshop on Data-Driven Decisions, Experimentation and Evolution (RCoSE/DDrEE) |
Date Published | May 2019 |
Publisher | IEEE |
ISBN Number | 978-1-7281-2247-2 |
Keywords | application domain, Chromium project, data-driven insights, I-O Systems, i-o systems security, interpretation, metric, natural language feedback, programming language, pubcrawl, security, security metrics, security of data, software metrics, threshold, Vulnerability, vulnerability discovery metrics |
Abstract | Software metrics help developers discover and fix mistakes. However, despite promising empirical evidence, vulnerability discovery metrics are seldom relied upon in practice. In prior research, the effectiveness of these metrics has typically been expressed using precision and recall of a prediction model that uses the metrics as explanatory variables. These prediction models, being black boxes, may not be perceived as useful by developers. However, by systematically interpreting the models and metrics, we can provide developers with nuanced insights about factors that have led to security mistakes in the past. In this paper, we present a preliminary approach to using vulnerability discovery metrics to provide insightful feedback to developers as they engineer software. We collected ten metrics (churn, collaboration centrality, complexity, contribution centrality, nesting, known offender, source lines of code, # inputs, # outputs, and # paths) from six open-source projects. We assessed the generalizability of the metrics across two contextual dimensions (application domain and programming language) and between projects within a domain, computed thresholds for the metrics using an unsupervised approach from literature, and assessed the ability of these unsupervised thresholds to classify risk from historical vulnerabilities in the Chromium project. The observations from this study feed into our ongoing research to automatically aggregate insights from the various analyses to generate natural language feedback on security. We hope that our approach to generate automated feedback will accelerate the adoption of research in vulnerability discovery metrics. |
URL | https://ieeexplore.ieee.org/document/8818181 |
DOI | 10.1109/RCoSE/DDrEE.2019.00008 |
Citation Key | munaiah_data-driven_2019 |