Title | LEMNA: Explaining Deep Learning Based Security Applications |
Publication Type | Conference Paper |
Year of Publication | 2018 |
Authors | Guo, Wenbo, Mu, Dongliang, Xu, Jun, Su, Purui, Wang, Gang, Xing, Xinyu |
Conference Name | Proceedings of the 2018 ACM SIGSAC Conference on Computer and Communications Security |
Publisher | ACM |
Conference Location | New York, NY, USA |
ISBN Number | 978-1-4503-5693-0 |
Keywords | Binary Analysis, controller area network security, Cyber-physical systems, deep recurrent neural networks, explainable AI, Internet of Things, pubcrawl, resilience |
Abstract | While deep learning has shown a great potential in various domains, the lack of transparency has limited its application in security or safety-critical areas. Existing research has attempted to develop explanation techniques to provide interpretable explanations for each classification decision. Unfortunately, current methods are optimized for non-security tasks ( e.g., image analysis). Their key assumptions are often violated in security applications, leading to a poor explanation fidelity. In this paper, we propose LEMNA, a high-fidelity explanation method dedicated for security applications. Given an input data sample, LEMNA generates a small set of interpretable features to explain how the input sample is classified. The core idea is to approximate a local area of the complex deep learning decision boundary using a simple interpretable model. The local interpretable model is specially designed to (1) handle feature dependency to better work with security applications ( e.g., binary code analysis); and (2) handle nonlinear local boundaries to boost explanation fidelity. We evaluate our system using two popular deep learning applications in security (a malware classifier, and a function start detector for binary reverse-engineering). Extensive evaluations show that LEMNA's explanation has a much higher fidelity level compared to existing methods. In addition, we demonstrate practical use cases of LEMNA to help machine learning developers to validate model behavior, troubleshoot classification errors, and automatically patch the errors of the target models. |
URL | http://doi.acm.org/10.1145/3243734.3243792 |
DOI | 10.1145/3243734.3243792 |
Citation Key | guo_lemna:_2018 |