TIFF: Using Input Type Inference To Improve Fuzzing

Title: TIFF: Using Input Type Inference To Improve Fuzzing
Publication Type: Conference Paper
Year of Publication: 2018
Authors: Jain, Vivek; Rawat, Sanjay; Giuffrida, Cristiano; Bos, Herbert
Conference Name: Proceedings of the 34th Annual Computer Security Applications Conference
Publisher: ACM
ISBN Number: 978-1-4503-6569-7
Keywords: composability, data-structure identification, fuzzing, Metrics, pubcrawl, security, taint analysis, Taint-flow analysis, type inference, vulnerability/bug detection
Abstract

Developers commonly use fuzzing techniques to hunt down all manner of memory corruption vulnerabilities during the testing phase. Irrespective of the fuzzer, input mutation plays a central role in providing adequate code coverage, as well as in triggering bugs. However, each class of memory corruption bugs requires a different trigger condition. While the goal of a fuzzer is to find bugs, most existing fuzzers merely approximate this goal by targeting their mutation strategies toward maximizing code coverage. In this work, we present a new mutation strategy that maximizes the likelihood of triggering memory-corruption bugs by generating fewer, but better, inputs. In particular, our strategy achieves bug-directed mutation by inferring the type of the input bytes. To do so, it tags each offset of the input with a basic type (e.g., 32-bit integer, string, array, etc.), while deriving mutation rules for specific classes of bugs. We infer types by means of in-memory data-structure identification and dynamic taint analysis, and implement our novel mutation strategy in a fully functional fuzzer which we call TIFF (Type Inference-based Fuzzing Framework). Our evaluation on real-world applications shows that type-based fuzzing triggers bugs much earlier than existing solutions, while maintaining high code coverage. For example, on several real-world applications and libraries (e.g., poppler, mpg123, etc.), we find real bugs (with known CVEs) in almost half the time and with up to an order of magnitude fewer inputs than state-of-the-art fuzzers.
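
The type-directed mutation described in the abstract can be illustrated with a minimal sketch. This is not TIFF's implementation: the names `TypedField`, `INT32_INTERESTING`, and `mutate`, the two type labels, and the specific rules are all assumptions made for illustration. In TIFF the type tags are inferred by data-structure identification and dynamic taint analysis; here they are simply written by hand.

```python
import struct
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class TypedField:
    """Hypothetical type tag for a span of input bytes."""
    offset: int  # start offset of the field in the input
    size: int    # field size in bytes
    kind: str    # "int32" or "string" (illustrative labels only)


# Assumed boundary values; integers near signed/unsigned limits are a
# common way to provoke integer overflows and bad length checks.
INT32_INTERESTING = [0, 1, 0x7FFFFFFF, 0x80000000, 0xFFFFFFFF]


def mutate(data: bytes, field: TypedField) -> List[bytes]:
    """Return mutated copies of `data`, one per rule applicable to `field`."""
    head = data[:field.offset]
    body = data[field.offset:field.offset + field.size]
    tail = data[field.offset + field.size:]

    if field.kind == "int32":
        # Overwrite the whole integer (little-endian) with each boundary value.
        return [head + struct.pack("<I", v) + tail for v in INT32_INTERESTING]
    if field.kind == "string":
        # Grow the string to probe for buffer overflows, and also try empty.
        return [head + body + b"A" * 1024 + tail, head + tail]
    return []  # unknown type: no type-specific rule in this sketch
```

For example, given an input such as `b"HDR" + struct.pack("<I", 16) + b"name"` with a 32-bit integer tagged at offset 3 and a string at offset 7, `mutate` produces five integer variants and two string variants, rather than flipping bytes without regard to what they represent.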

URL: https://dl.acm.org/doi/10.1145/3274694.3274746
DOI: 10.1145/3274694.3274746
Citation Key: jain_tiff:_2018