Data-Driven Insights from Vulnerability Discovery Metrics

Title: Data-Driven Insights from Vulnerability Discovery Metrics
Publication Type: Conference Paper
Year of Publication: 2019
Authors: Munaiah, Nuthan; Meneely, Andrew
Conference Name: 2019 IEEE/ACM Joint 4th International Workshop on Rapid Continuous Software Engineering and 1st International Workshop on Data-Driven Decisions, Experimentation and Evolution (RCoSE/DDrEE)
Date Published: May 2019
Publisher: IEEE
ISBN Number: 978-1-7281-2247-2
Keywords: application domain, Chromium project, data-driven insights, I-O Systems, i-o systems security, interpretation, metric, natural language feedback, programming language, pubcrawl, security, security metrics, security of data, software metrics, threshold, Vulnerability, vulnerability discovery metrics
Abstract

Software metrics help developers discover and fix mistakes. However, despite promising empirical evidence, vulnerability discovery metrics are seldom relied upon in practice. In prior research, the effectiveness of these metrics has typically been expressed using precision and recall of a prediction model that uses the metrics as explanatory variables. These prediction models, being black boxes, may not be perceived as useful by developers. However, by systematically interpreting the models and metrics, we can provide developers with nuanced insights about factors that have led to security mistakes in the past. In this paper, we present a preliminary approach to using vulnerability discovery metrics to provide insightful feedback to developers as they engineer software. We collected ten metrics (churn, collaboration centrality, complexity, contribution centrality, nesting, known offender, source lines of code, # inputs, # outputs, and # paths) from six open-source projects. We assessed the generalizability of the metrics across two contextual dimensions (application domain and programming language) and between projects within a domain, computed thresholds for the metrics using an unsupervised approach from the literature, and assessed the ability of these unsupervised thresholds to classify risk from historical vulnerabilities in the Chromium project. The observations from this study feed into our ongoing research to automatically aggregate insights from the various analyses to generate natural language feedback on security. We hope that our approach to generating automated feedback will accelerate the adoption of research in vulnerability discovery metrics.

URL: https://ieeexplore.ieee.org/document/8818181
DOI: 10.1109/RCoSE/DDrEE.2019.00008
Citation Key: munaiah_data-driven_2019