Active Learning to Improve Static Analysis

Title: Active Learning to Improve Static Analysis
Publication Type: Conference Paper
Year of Publication: 2019
Authors: Berman, Maxwell; Adams, Stephen; Sherburne, Tim; Fleming, Cody; Beling, Peter
Conference Name: 2019 18th IEEE International Conference On Machine Learning And Applications (ICMLA)
Keywords: active learning, composability, Forestry, Human Behavior, Prediction algorithms, Predictive models, pubcrawl, Resiliency, security, static analysis, Tools, Training
Abstract: Static analysis tools are programs that run on source code before it is compiled into binary executables and attempt to find flaws or defects early in development. If left unresolved, these flaws could pose security risks. Although numerous static analysis tools exist, no single tool is optimal, so several tools are often used together to analyze code. Further, some of the alerts these tools generate are low-priority or false alarms. Machine learning algorithms have been developed to distinguish true alerts from false alarms; however, significant man-hours must be dedicated to labeling data sets for training. This study investigates the use of active learning to reduce the number of labeled alerts needed to adequately train a classifier. The numerical experiments demonstrate that a query-by-committee active learning algorithm can significantly reduce the number of labeled alerts needed to achieve performance similar to that of a classifier trained on a data set of nearly 60,000 labeled alerts.
DOI: 10.1109/ICMLA.2019.00215
Citation Key: berman_active_2019
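
The abstract names query by committee as the active learning strategy but does not say how the committee is built or how disagreement is measured. The following is a minimal sketch of the selection step only, assuming vote entropy as the disagreement measure. The committee members, the `severity` field, and the function names `vote_entropy` and `select_queries` are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def vote_entropy(votes):
    """Vote entropy of one alert's committee votes (0.0 means full agreement)."""
    counts = Counter(votes)
    total = len(votes)
    return sum(-(c / total) * math.log(c / total) for c in counts.values())


def select_queries(committee, unlabeled, batch_size=1):
    """Return indices of the unlabeled alerts the committee disagrees on most."""
    scores = []
    for idx, alert in enumerate(unlabeled):
        # Each committee member casts one vote: True = true alert, False = false alarm.
        votes = [member(alert) for member in committee]
        scores.append((vote_entropy(votes), idx))
    # Highest disagreement first; ties are broken by the lowest index.
    scores.sort(key=lambda s: (-s[0], s[1]))
    return [idx for _, idx in scores[:batch_size]]


# Hypothetical committee: three threshold rules over a made-up "severity" score.
# In practice these would be classifiers trained on the currently labeled alerts.
committee = [
    lambda a: a["severity"] >= 3,
    lambda a: a["severity"] >= 5,
    lambda a: a["severity"] >= 7,
]
unlabeled = [{"severity": s} for s in (1, 4, 6, 9)]

# Indices of the alerts a human annotator would be asked to label next.
queried = select_queries(committee, unlabeled, batch_size=2)
```

In this toy setup the committee agrees unanimously on the first and last alerts and splits on the two in the middle, so those two are the ones sent for labeling. A full loop would add the new labels to the training set, retrain the committee, and repeat until the labeling budget is spent.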