Improving Fuzzing through Controlled Compilation

Title: Improving Fuzzing through Controlled Compilation
Publication Type: Conference Paper
Year of Publication: 2020
Authors: Simon, L., Verma, A.
Conference Name: 2020 IEEE European Symposium on Security and Privacy (EuroS&P)
Date Published: Sep
Keywords: afl, AFL's configuration, beneficial compiler optimizations, compiler security, compositionality, concolic fuzzers, controlled compilation, coverage, coverage measures, coverage metrics, current coverage-based evaluation measures, fuzzing, fuzzing consistency, fuzzing strategy, fuzzy set theory, grey-box fuzzers, Intermediate Representation, LLVM, Metrics, open source projects, program compilers, program line, program testing, pubcrawl, qualitative coverage, Resiliency, rigorous evaluation methodology, Scalability, security of data, source code, standard compilers
Abstract: We observe that operations performed by standard compilers harm fuzzing because the optimizations and the Intermediate Representation (IR) lead to transformations that improve execution speed at the expense of fuzzing. To remedy this problem, we propose 'controlled compilation', a set of techniques to automatically refactor a program's source code and cherry-pick beneficial compiler optimizations to improve fuzzing. We design, implement, and evaluate controlled compilation by building a new toolchain with Clang/LLVM. We perform an evaluation on 10 open source projects and compare the results of AFL to state-of-the-art grey-box fuzzers and concolic fuzzers. We show that when programs are compiled with this new toolchain, AFL covers 30% new code on average and finds 21 additional bugs in real-world programs. Our study reveals that controlled compilation often covers more code and finds more bugs than state-of-the-art fuzzing techniques, without the need to write a fuzzer from scratch or resort to advanced techniques. We identify two main reasons why. First, it has proven difficult for researchers to appropriately configure existing fuzzers such as AFL. To address this problem, we provide guidelines and new LLVM passes to help automate AFL's configuration. This will enable researchers to perform a fairer comparison with AFL. Second, we find that current coverage-based evaluation measures (e.g. the total number of visited lines, edges, or BBs) are inadequate because they lose valuable information, such as which parts of a program a fuzzer actually visits and how consistently it does so. Coverage is considered a useful metric to evaluate a fuzzer's performance and devise a fuzzing strategy. However, the lack of a standard methodology for evaluating coverage remains a problem. To address this, we propose a rigorous evaluation methodology based on 'qualitative coverage'.
Qualitative coverage uniquely identifies each program line to help understand which lines are commonly visited by different fuzzers vs. which lines are visited only by a particular fuzzer. Throughout our study, we show the benefits of this new evaluation methodology. For example, we provide valuable insights into the consistency of fuzzers, i.e. their ability to cover the same code or find the same bug across multiple independent runs. Overall, our evaluation methodology based on qualitative coverage helps to understand whether a fuzzer performs better than, worse than, or complementary to another fuzzer. This helps security practitioners adjust their fuzzing strategies.
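The idea behind qualitative coverage can be sketched with simple set operations: model each fuzzer's coverage as a set of uniquely identified program lines, then intersect and difference those sets to find commonly visited vs. fuzzer-specific lines. This is an illustrative sketch only, not the paper's toolchain; the coverage sets and the `qualitative_compare` helper are hypothetical.

```python
# Sketch of qualitative-coverage comparison (illustrative, not the paper's code).
# Coverage is modeled as a set of (file, line) identifiers per fuzzer,
# so each program line is uniquely identified across runs.

def qualitative_compare(cov_a, cov_b):
    """Return lines both fuzzers reach, plus lines unique to each."""
    common = cov_a & cov_b   # visited by both fuzzers
    only_a = cov_a - cov_b   # visited only by fuzzer A
    only_b = cov_b - cov_a   # visited only by fuzzer B
    return common, only_a, only_b

# Hypothetical coverage sets from two fuzzers over the same target:
afl_cov = {("parse.c", 10), ("parse.c", 11), ("emit.c", 42)}
other_cov = {("parse.c", 10), ("emit.c", 50)}

common, afl_only, other_only = qualitative_compare(afl_cov, other_cov)
print(sorted(common))       # lines both fuzzers cover
print(sorted(afl_only))     # lines only AFL covers
print(sorted(other_only))   # lines only the other fuzzer covers
```

Comparing the sets themselves, rather than their sizes, is what distinguishes this from quantitative metrics such as total line count: two fuzzers with equal counts may cover disjoint code and thus be complementary.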
DOI: 10.1109/EuroSP48549.2020.00011
Citation Key: simon_improving_2020