How to Build Static Checking Systems Using Orders of Magnitude Less Code

Title: How to Build Static Checking Systems Using Orders of Magnitude Less Code
Publication Type: Conference Paper
Year of Publication: 2016
Authors: Brown, Fraser; Nötzli, Andres; Engler, Dawson
Conference Name: Proceedings of the Twenty-First International Conference on Architectural Support for Programming Languages and Operating Systems
Publisher: ACM
Conference Location: New York, NY, USA
ISBN Number: 978-1-4503-4091-5
Keywords: bug finding, composability, Human Behavior, micro-grammars, parsing, pubcrawl, Resiliency, static analysis, static code analysis
Abstract

Modern static bug finding tools are complex. They typically consist of hundreds of thousands of lines of code, and most of them are wedded to one language (or even one compiler). This complexity makes the systems hard to understand, hard to debug, and hard to retarget to new languages, thereby dramatically limiting their scope. This paper reduces checking system complexity by addressing a fundamental assumption, the assumption that checkers must depend on a full-blown language specification and compiler front end. Instead, our program checkers are based on drastically incomplete language grammars ("micro-grammars") that describe only portions of a language relevant to a checker. As a result, our implementation is tiny: roughly 2500 lines of code, about two orders of magnitude smaller than a typical system. We hope that this dramatic increase in simplicity will allow people to use more checkers on more systems in more languages. We implement our approach in μchex, a language-agnostic framework for writing static bug checkers. We use it to build micro-grammar-based checkers for six languages (C, the C preprocessor, C++, Java, JavaScript, and Dart) and find over 700 errors in real-world projects.
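
To make the micro-grammar idea concrete, here is a minimal sketch in Python. It is an illustration only and not the authors' implementation: the names `IF_HEAD`, `BARE_ASSIGN`, `balanced_span`, and `check_assign_in_condition` are hypothetical, and the check itself (flagging an assignment inside an `if` condition) is an assumed example rather than one taken from the paper. The "grammar" describes only `if ( <balanced text> )` and skips everything else in the file.

```python
import re

# Hypothetical micro-grammar: describe only "if ( <balanced text> )"
# and ignore every other construct in the source file.
IF_HEAD = re.compile(r"\bif\s*\(")

# A lone '=' that is not part of ==, !=, <=, >=, or => (heuristic).
BARE_ASSIGN = re.compile(r"(?<![=!<>])=(?![=>])")


def balanced_span(text, open_idx):
    """Return the index just past the ')' that matches the '(' at open_idx,
    or None if the parentheses never balance."""
    depth = 0
    for i in range(open_idx, len(text)):
        if text[i] == "(":
            depth += 1
        elif text[i] == ")":
            depth -= 1
            if depth == 0:
                return i + 1
    return None


def check_assign_in_condition(source):
    """Return a list of (line_number, condition) pairs, one for each
    if-condition that contains a bare '='."""
    findings = []
    for m in IF_HEAD.finditer(source):
        open_idx = m.end() - 1  # position of the '(' after 'if'
        end = balanced_span(source, open_idx)
        if end is None:
            continue  # fragment we cannot match: skip it rather than fail
        cond = source[open_idx + 1:end - 1]
        if BARE_ASSIGN.search(cond):
            line = source.count("\n", 0, m.start()) + 1
            findings.append((line, cond.strip()))
    return findings
```

Because the sketch never parses declarations, types, or statements, the same code can be pointed at C, Java, or JavaScript sources unchanged. The cost is imprecision: it does not skip string literals or comments, so a condition such as `s == "a=b"` would be reported as well.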

URL: http://doi.acm.org/10.1145/2872362.2872364
DOI: 10.1145/2872362.2872364
Citation Key: brown_how_2016