EAGER: Tracing Privacy-Policy Statements into Code for Privacy-Aware Mobile App Development

Project Details

Lead PI

Co-PIs

Performance Period

Aug 15, 2017 - Jul 31, 2018

Institution(s)

University of Texas at San Antonio

Award Number


Users of smartphones and mobile applications face unprecedented threats to their privacy. In the United States, privacy policies serve as the primary means of informing users about how mobile apps process privacy-sensitive data. Application developers are responsible for implementing privacy policies so that the code corresponds to the policies. Currently, there are no techniques for tracing high-level privacy practices into code. New research is needed to develop automatic privacy-aware development tools, as well as tools that determine whether the code implements the policies correctly.

To automatically trace the high-level privacy practices stated in privacy policies into application code, the project borrows the idea of context-based classification from information extraction in natural language processing (NLP). In particular, both the natural-language policies and the program code are subjected to statistical NLP techniques to determine relationships between the policies and their implementations. The assumption that NLP techniques can be applied to code rests on recent work that has established the "naturalness" of software, in the sense that statistical NLP techniques appear to work just as well for computer programs as they do for, say, English. The project will mine a large set of privacy policies and corresponding code to find out how, and to what extent, policies manifest themselves in code. To the extent that consistency between privacy policies and code can be determined, new approaches to privacy policy understanding and enforcement might become possible.
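
As a rough illustration of what relating policy text to code with statistical NLP could look like, the sketch below treats policy statements and code snippets as bags of word tokens and links each statement to its most similar snippet using TF-IDF weights and cosine similarity. This is an assumption made for illustration only: the abstract names context-based classification but gives no algorithm, so TF-IDF matching is a simple stand-in rather than the project's method. The function names (`tokenize`, `tfidf_vectors`, `cosine`, `trace`) and the sample policy and code strings are all hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Split prose or code into lowercase word tokens.

    camelCase identifiers are broken apart first, so that
    "getLastKnownLocation" yields get / last / known / location.
    """
    text = re.sub(r"([a-z0-9])([A-Z])", r"\1 \2", text)
    return [t.lower() for t in re.findall(r"[A-Za-z]+", text)]


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (a dict) per tokenized document."""
    n = len(docs)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        # Smoothed IDF, so terms present in every document keep a small weight.
        vectors.append(
            {t: c * (math.log((1 + n) / (1 + df[t])) + 1) for t, c in tf.items()}
        )
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def trace(policy_statements, code_snippets):
    """For each policy statement, return (index of best snippet, similarity)."""
    docs = [tokenize(s) for s in policy_statements]
    docs += [tokenize(c) for c in code_snippets]
    vecs = tfidf_vectors(docs)
    policy_vecs = vecs[: len(policy_statements)]
    code_vecs = vecs[len(policy_statements):]
    links = []
    for pv in policy_vecs:
        scores = [cosine(pv, cv) for cv in code_vecs]
        best = max(range(len(scores)), key=scores.__getitem__)
        links.append((best, scores[best]))
    return links


# Hypothetical inputs, invented for this example.
policies = [
    "We collect your location to show nearby stores.",
    "We access your contacts to help you find friends.",
]
snippets = [
    "List<Contact> contacts = contactsProvider.queryContacts();",
    "Location location = locationManager.getLastKnownLocation(provider);",
]

for statement, (idx, score) in zip(policies, trace(policies, snippets)):
    print(f"{statement!r} -> snippet {idx} (similarity {score:.2f})")
```

A bag-of-words match like this only captures shared vocabulary between a policy and its implementation. It ignores the surrounding context that a context-based classifier would use, and it cannot by itself decide whether the code implements a policy correctly.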