Biblio
The Android research community has long focused on building an Android API permission specification, which can be leveraged by app developers to determine the optimum set of permissions necessary for a correct and safe execution of their app. However, while prominent existing efforts provide a good approximation of the permission specification, they suffer from a few shortcomings. Dynamic approaches cannot generate complete results, although accurate for the particular execution. In contrast, static approaches provide better coverage, but produce imprecise mappings due to their lack of path-sensitivity. In fact, in light of Android's access control complexity, the approximations hardly abstract the actual co-relations between enforced protections. To address this, we propose to precisely derive Android protection specification in a path-sensitive fashion, using a novel graph abstraction technique. We further showcase how we can apply the generated maps to tackle security issues through logical satisfiability reasoning. Our constructed maps for 4 Android Open Source Project (AOSP) images highlight the significance of our approach, as \textasciitilde41% of APIs' protections cannot be correctly modeled without our technique.
Modern software systems are becoming increasingly complex, relying on a lot of third-party library support. Library behaviors are hence an integral part of software behaviors. Analyzing them is as important as analyzing the software itself. However, analyzing libraries is highly challenging due to the lack of source code, implementation in different languages, and complex optimizations. We observe that many Java library functions provide excellent documentation, which concisely describes the functionalities of the functions. We develop a novel technique that can construct models for Java API functions by analyzing the documentation. These models are simpler implementations in Java compared to the original ones and hence easier to analyze. More importantly, they provide the same functionalities as the original functions. Our technique successfully models 326 functions from 14 widely used Java classes. We also use these models in static taint analysis on Android apps and dynamic slicing for Java programs, demonstrating the effectiveness and efficiency of our models.
Traditional sensitive data disclosure analysis faces two challenges: to identify sensitive data that is not generated by specific API calls, and to report the potential disclosures when the disclosed data is recognized as sensitive only after the sink operations. We address these issues by developing BidText, a novel static technique to detect sensitive data disclosures. BidText formulates the problem as a type system, in which variables are typed with the text labels that they encounter (e.g., during key-value pair operations). The type system features a novel bi-directional propagation technique that propagates the variable label sets through forward and backward data-flow. A data disclosure is reported if a parameter at a sink point is typed with a sensitive text label. BidText is evaluated on 10,000 Android apps. It reports 4,406 apps that have sensitive data disclosures, with 4,263 apps having log based disclosures and 1,688 having disclosures due to other sinks such as HTTP requests. Existing techniques can only report 64.0% of what BidText reports. And manual inspection shows that the false positive rate for BidText is 10%.