Title | HashMTI: Scalable Mutation-based Taint Inference with Hash Records |
Publication Type | Conference Paper |
Year of Publication | 2021 |
Authors | Kong, Xiangdong, Tang, Yong, Wang, Pengfei, Wei, Shuning, Yue, Tai |
Conference Name | 2021 IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER) |
Keywords | composability, Conferences, fuzzing, Hash functions, Memory management, Metrics, program analysis, Prototypes, pubcrawl, Scalability, Software, Software Testing, taint analysis |
Abstract | Mutation-based taint inference (MTI) is a novel technique for taint analysis. Compared with traditional techniques that track propagations of taint tags, MTI infers a variable is tainted if its values change due to input mutations, which is lightweight and conceptually sound. However, there are 3 challenges to its efficiency and scalability: (1) it cannot efficiently record variable values to monitor their changes; (2) it consumes a large amount of memory monitoring variable values, especially on complex programs; and (3) its excessive memory overhead leads to a low hit ratio of CPU cache, which slows down the speed of taint inference. This paper presents an efficient and scalable solution named HashMTI. We first explain the above challenges based on 4 observations. Motivated by these challenges, we propose a hash record scheme to efficiently monitor changes in variable values and significantly reduce the memory overhead. The scheme is based on our specially selected and optimized hash functions that possess 3 crucial properties. Moreover, we propose the DoubleMutation strategy, which applies additional mutations to mitigate the limitation of the hash record and detect more taint information. We implemented a prototype of HashMTI and evaluated it on 18 real-world programs and 4 LAVA-M programs. Compared with the baseline OrigMTI, HashMTI significantly reduces the overhead while having similar accuracy. It achieves a speedup of 2.5X to 23.5X and consumes little memory which is on average 70.4 times less than that of OrigMTI. |
DOI | 10.1109/SANER50967.2021.00017 |
Citation Key | kong_hashmti_2021 |