Title | Identifying Ubiquitious Third-Party Libraries in Compiled Executables Using Annotated and Translated Disassembled Code with Supervised Machine Learning |
Publication Type | Conference Paper |
Year of Publication | 2020 |
Authors | Haile, J., Havens, S. |
Conference Name | 2020 IEEE Security and Privacy Workshops (SPW) |
Date Published | May |
Keywords | Bayes method, Classification algorithms, clustering methods, Databases, graph theory, Internet, k-nearest neighbor search, Libraries, machine learning, Matrices, Measurement, Microprogramming, nearest neighbor search, Neural Network, Predictive Metrics, pubcrawl, reverse engineering, Software, supervised learning, supply chain management, Support vector machines, Task Analysis, Tools, vector |
Abstract | The size and complexity of the software ecosystem are a major challenge for vendors, asset owners, and cybersecurity professionals who need to understand the security posture of these systems. Annotated and Translated Disassembled Code is a graph-based datastore designed to organize firmware and software analysis data across builds, packages, and systems, providing a highly scalable platform that enables automated binary software analysis tasks, including corpora construction and storage for machine learning. This paper describes an approach for identifying ubiquitous third-party libraries in firmware and software using Annotated and Translated Disassembled Code and supervised machine learning. Annotated and Translated Disassembled Code provides matched libraries, function names, and addresses of previously unidentified code in software as it is being automatically analyzed. This data can be ingested by other software analysis tools to improve accuracy and save time. Defenders can add the identified libraries to their vulnerability searches and add effective detection and mitigation to their operating environment. |
DOI | 10.1109/SPW50608.2020.00042 |
Citation Key | haile_identifying_2020 |