Visible to the public Biblio

Filters: Author is Mao, Bing  [Clear All Filters]
2023-04-28
Chen, Ligeng, He, Zhongling, Wu, Hao, Xu, Fengyuan, Qian, Yi, Mao, Bing.  2022.  DIComP: Lightweight Data-Driven Inference of Binary Compiler Provenance with High Accuracy. 2022 IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER). :112–122.
Binary analysis is pervasively utilized to assess software security and test vulnerabilities without accessing source codes. The analysis validity is heavily influenced by the inferring ability of information related to the code compilation. Among the compilation information, compiler type and optimization level, as the key factors determining how binaries look like, are still difficult to be inferred efficiently with existing tools. In this paper, we conduct a thorough empirical study on the binary's appearance under various compilation settings and propose a lightweight binary analysis tool based on the simplest machine learning method, called DIComP to infer the compiler and optimization level via most relevant features according to the observation. Our comprehensive evaluations demonstrate that DIComP can fully recognize the compiler provenance, and it is effective in inferring the optimization levels with up to 90% accuracy. Also, it is efficient to infer thousands of binaries at a millisecond level with our lightweight machine learning model (1MB).
2019-09-26
Pang, Chengbin, Du, Yunlan, Mao, Bing, Guo, Shanqing.  2018.  Mapping to Bits: Efficiently Detecting Type Confusion Errors. Proceedings of the 34th Annual Computer Security Applications Conference. :518-528.

The features of modularity and inheritance in C++ facilitate the developers' usage, but also give rise to the problem of type confusion. As an ancestor class may have a different data layout from its descendant class, a dangerous downcasting operation from the ancestor to its descendant can lead to a critical attack, such as control flow hijacking, out-of-bounds access to neighbor memory area, etc. As reported in CVE, such vulnerabilities have been found in various common-used software, including Google Chrome, Firefox and Adobe Flash Player, and have a trend of increase in recent years. The urgency of addressing type confusion problems quickens the pace of researchers coming to corresponding solutions. However, the existing works either handle the problem partially, or suffer from the high performance and memory overhead, especially to the large-scale projects. We present Bitype to check the validity explicitly when a type is downcasting to another, maintaining high coverage and reducing overhead and compilation time massively. The core of our design is a Safe Encoding Scheme, which encodes all of the classes by mapping them to bits. With this scheme, Bitype treats the classes and their safe convertible classes as codes and verifies typecastings in an xor operation, both decreasing the performance overhead of check and the memory overhead. Besides, we implement a Clang Tool to avoid the repeated collection of inheritance relationships and deploy a two-level lookup table to trace objects efficiently. Evaluated on SPEC CPU2006 benchmarks and Firefox browser, Bitype shows a slightly higher coverage of typecasting compared to the state-of-the-art HexType[22], but reduces the performance overhead by 2 to 16 times, the memory overhead by 2 to 3 times, the compilation time by 21 to 223 times. As a result, our solution is a practical and efficient typecasting checker for commodity software.