Title | Industrial Oriented Evaluation of Fuzzing Techniques |
Publication Type | Conference Paper |
Year of Publication | 2021 |
Authors | Wang, Mingzhe; Liang, Jie; Zhou, Chijin; Chen, Yuanliang; Wu, Zhiyong; Jiang, Yu |
Conference Name | 2021 14th IEEE Conference on Software Testing, Verification and Validation (ICST) |
Date Published | April |
Keywords | Conferences, Correlation, fuzzing, Fuzzing Optimizations, Industries, Industry Settings, Measurement, Measurement and Metrics Testing, Metrics, Pitfalls and Solutions, pubcrawl, Stability analysis, Synchronization |
Abstract | Fuzzing is a promising method for discovering vulnerabilities. Recently, various techniques have been developed to improve the efficiency of fuzzing, and impressive gains have been observed in evaluation results. However, evaluation is complex, as many factors affect the results, for example, test suites, baselines, and metrics. Moreover, most experimental setups are lab-oriented and lack industrial settings such as large code bases and parallel runs. The correlation between academic evaluation results and bug-finding ability in real industrial settings has not been sufficiently studied. In this paper, we test representative fuzzing techniques to reveal their efficiency in industrial settings. First, we apply typical fuzzers to small projects from the LAVA-M suite, which is widely used in academia. We also apply the same fuzzers to large practical projects from Google's fuzzer-test-suite, which is rarely used in academic settings. Both experiments are performed in single and parallel runs. By analyzing the results, we find that most optimizations that work well on the LAVA-M suite fail to achieve satisfactory results on Google's fuzzer-test-suite (e.g., compared to AFL, QSYM detects 82x more synthesized bugs in LAVA-M, but only detects 26% real bugs in Google's fuzzer-test-suite), and that the original AFL even outperforms most academic optimization variants in parallel runs, which are widely used in industry (e.g., AFL covers 13% more paths than AFLFast). We then summarize common pitfalls of those optimizations, analyze the corresponding root causes, and propose potential directions, such as orchestration and synchronization, to overcome the problems. For example, when running in parallel on those large practical projects, the proposed horizontal orchestration covers 36%-82% more paths and discovers 46%-150% more unique crashes or bugs than fuzzers such as AFL, FairFuzz, and QSYM. |
DOI | 10.1109/ICST49551.2021.00043 |
Citation Key | wang_industrial_2021 |