Data-Driven Insights from Vulnerability Discovery Metrics

Submitted by grigby1 on Mon, 03/09/2020 - 2:38pm

Title	Data-Driven Insights from Vulnerability Discovery Metrics
Publication Type	Conference Paper
Year of Publication	2019
Authors	Munaiah, Nuthan; Meneely, Andrew
Conference Name	2019 IEEE/ACM Joint 4th International Workshop on Rapid Continuous Software Engineering and 1st International Workshop on Data-Driven Decisions, Experimentation and Evolution (RCoSE/DDrEE)
Date Published	May 2019
Publisher	IEEE
ISBN Number	978-1-7281-2247-2
Keywords	application domain, Chromium project, data-driven insights, I-O Systems, i-o systems security, interpretation, metric, natural language feedback, programming language, pubcrawl, security, security metrics, security of data, software metrics, threshold, Vulnerability, vulnerability discovery metrics
Abstract	Software metrics help developers discover and fix mistakes. However, despite promising empirical evidence, vulnerability discovery metrics are seldom relied upon in practice. In prior research, the effectiveness of these metrics has typically been expressed using precision and recall of a prediction model that uses the metrics as explanatory variables. These prediction models, being black boxes, may not be perceived as useful by developers. However, by systematically interpreting the models and metrics, we can provide developers with nuanced insights about factors that have led to security mistakes in the past. In this paper, we present a preliminary approach to using vulnerability discovery metrics to provide insightful feedback to developers as they engineer software. We collected ten metrics (churn, collaboration centrality, complexity, contribution centrality, nesting, known offender, source lines of code, # inputs, # outputs, and # paths) from six open-source projects. We assessed the generalizability of the metrics across two contextual dimensions (application domain and programming language) and between projects within a domain, computed thresholds for the metrics using an unsupervised approach from the literature, and assessed the ability of these unsupervised thresholds to classify risk from historical vulnerabilities in the Chromium project. The observations from this study feed into our ongoing research to automatically aggregate insights from the various analyses to generate natural language feedback on security. We hope that our approach to generating automated feedback will accelerate the adoption of research in vulnerability discovery metrics.
URL	https://ieeexplore.ieee.org/document/8818181
DOI	10.1109/RCoSE/DDrEE.2019.00008
Citation Key	munaiah_data-driven_2019
