Biblio
Android application markets run various security analyses on each application to predict its potential harm before publishing it. Because almost all static analysis tools can only detect malicious behavior in the Java layer, a growing amount of malware evades static analysis by moving its malicious code to the Native layer. To address this situation, this paper proposes a new research direction, defined as Inter-language Static Analysis. It introduces the technologies involved and surveys the current research results for each, such as static analysis in the Java layer, binary analysis in the Native layer, and Java-Native penetration technology.
We present a novel method for static analysis in which we combine data-flow analysis with machine learning to detect SQL injection (SQLi) and Cross-Site Scripting (XSS) vulnerabilities in PHP applications. We assembled a dataset from the National Vulnerability Database and the SAMATE project, containing vulnerable PHP code samples and their patched versions in which the vulnerability is fixed. We extracted features from the code samples by applying data-flow analysis techniques, including reaching definitions analysis, taint analysis, and reaching constants analysis. We used these features to train various probabilistic classifiers. To demonstrate the effectiveness of our approach, we built a tool called WIRECAML and compared it to other tools for vulnerability detection in PHP code. Our tool performed best for detecting both SQLi and XSS vulnerabilities. We also tried our approach on a number of open-source software applications and found a previously unknown vulnerability in a photo-sharing web application.
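As an illustration of one technique the abstract names, reaching definitions analysis, the following is a minimal sketch in Python over a hand-written toy control-flow graph. It is not WIRECAML's implementation: the function name `reaching_definitions`, the graph encoding, and the four-statement example program are all assumptions made for this illustration, and the analyzed program is a stand-in for PHP code rather than parsed PHP.

```python
from collections import defaultdict


def reaching_definitions(stmts, succs):
    """Classic iterative reaching-definitions analysis (hypothetical helper).

    stmts: dict mapping node id -> name of the variable defined there, or None.
    succs: dict mapping node id -> list of successor node ids.
    Returns a dict mapping node id -> set of definition node ids that may
    reach the entry of that node.
    """
    # Build the predecessor relation from the successor relation.
    preds = defaultdict(list)
    for node, targets in succs.items():
        for target in targets:
            preds[target].append(node)

    # Group definition sites by the variable they define (used for KILL sets).
    defs_of = defaultdict(set)
    for node, var in stmts.items():
        if var is not None:
            defs_of[var].add(node)

    in_sets = {node: set() for node in stmts}
    out_sets = {node: set() for node in stmts}

    worklist = list(stmts)
    while worklist:
        node = worklist.pop()
        # IN[n] is the union of OUT[p] over all predecessors p.
        in_n = set()
        for pred in preds[node]:
            in_n |= out_sets[pred]
        in_sets[node] = in_n

        # OUT[n] = GEN[n] | (IN[n] - KILL[n]).
        var = stmts[node]
        if var is None:
            out_n = set(in_n)
        else:
            out_n = (in_n - defs_of[var]) | {node}

        # Re-process successors only when OUT[n] changed.
        if out_n != out_sets[node]:
            out_sets[node] = out_n
            worklist.extend(succs.get(node, []))
    return in_sets


# Toy program (assumed for illustration):
#   1: x = user_input()
#   2: if cond:
#   3:     x = sanitize(x)
#   4: run_query(x)
stmts = {1: "x", 2: None, 3: "x", 4: None}
succs = {1: [2], 2: [3, 4], 3: [4], 4: []}

reach = reaching_definitions(stmts, succs)
print(sorted(reach[4]))  # definitions of x that may reach the query
```

In this toy program both definitions of `x` (nodes 1 and 3) may reach the query at node 4, so the unsanitized input can flow to the sink along the branch that skips sanitization. Facts of this kind are the sort of data-flow information that could be encoded as classifier features; how the paper actually encodes them is not stated in the abstract.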