SaTC: CORE: Medium: Collaborative: User-Centered Deployment of Differential Privacy

Project Details

Lead PI

Performance Period

Jan 01, 2020 - Dec 31, 2020

Institution(s)

Pennsylvania State University - University Park

Sponsor(s)

National Science Foundation

Award Number


Differential privacy (DP) has been accepted as the de facto standard for data privacy in the research community and beyond. Both companies and government agencies are trying to deploy DP technologies. Broader deployment of DP technology, however, faces challenges. This project aims to understand the needs of different stakeholders in data privacy, and to develop algorithms and software that enable broader deployment of private data sharing. The project's novelty lies in combining the expertise of social science researchers with that of computer scientists who have both theoretical and systems research experience related to DP, in order to develop a hybrid approach to private data sharing that achieves a better privacy-utility tradeoff. The project's impact is in advancing the state of the art in DP deployment in particular and privacy protection in general. More specifically, the project identifies the workflow of DP data sharing, improves understanding of DP communication, and develops new algorithms, privacy concepts, and privacy mechanisms to support deployment of DP.

The project has four tasks that will advance the understanding of user-centered DP and lay a foundation for its deployment. (1) Examine individual human users' perception, comprehension, and acceptance of the concept and guarantee of DP and of the effect of the privacy parameter, and investigate effective ways to communicate those concepts. (2) Apply methods from human factors and human-computer interaction to identify tasks, goals, and workflow in private data sharing. (3) Develop key algorithms and software for a hybrid approach to private data sharing. In the hybrid approach, one first publishes a private synopsis of the dataset using carefully selected low-degree marginals. From these marginals, one can either synthesize new datasets or answer queries directly using inference under the maximum entropy principle. The hybrid approach enhances this with interactive query answering, enabling extraction of information not covered by the low-degree marginals. (4) Develop techniques to further improve the privacy-utility tradeoff in private data sharing, including a theory of differential privacy under publishable information.
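
The marginal-based part of task (3) can be illustrated with a small sketch. This is not the project's algorithm: it assumes the Laplace mechanism for noising the marginals, add/remove-one-record neighboring datasets (so each marginal table has L1 sensitivity 1), an even split of the privacy budget across marginals, and iterative proportional fitting as a stand-in for maximum entropy inference. All function names, the toy dataset, and the choice of marginals are hypothetical.

```python
import numpy as np

def noisy_marginal(data, cols, domain_sizes, epsilon, rng):
    """Contingency table over the attributes in `cols`, with Laplace noise added."""
    shape = tuple(domain_sizes[c] for c in cols)
    counts = np.zeros(shape)
    np.add.at(counts, tuple(data[:, c] for c in cols), 1)
    # Assumption: neighbors differ by adding/removing one record, so one cell
    # changes by 1 and Laplace noise with scale 1/epsilon suffices.
    return counts + rng.laplace(scale=1.0 / epsilon, size=shape)

def to_distribution(noisy_counts):
    """Post-process noisy counts into a probability table (clip, then normalize)."""
    clipped = np.clip(noisy_counts, 0.0, None)
    total = clipped.sum()
    if total == 0:
        return np.full(clipped.shape, 1.0 / clipped.size)
    return clipped / total

def max_entropy_joint(marginals, domain_sizes, iterations=50):
    """Iterative proportional fitting, starting from the uniform distribution.

    `marginals` maps a sorted tuple of attribute indices to a probability table.
    With consistent marginals this converges to the maximum entropy joint;
    with noisy, slightly inconsistent marginals it is only an approximation.
    """
    n = len(domain_sizes)
    joint = np.full(domain_sizes, 1.0 / np.prod(domain_sizes))
    for _ in range(iterations):
        for cols, target in marginals.items():
            assert list(cols) == sorted(cols), "attribute indices must be sorted"
            other = tuple(i for i in range(n) if i not in cols)
            current = joint.sum(axis=other, keepdims=True)
            shape = [domain_sizes[i] if i in cols else 1 for i in range(n)]
            ratio = np.divide(target.reshape(shape), current,
                              out=np.zeros_like(current), where=current > 0)
            joint = joint * ratio
    return joint / joint.sum()

# Toy dataset: 5000 records over three categorical attributes (hypothetical).
rng = np.random.default_rng(0)
domain_sizes = (2, 3, 2)
data = np.column_stack([rng.integers(0, d, size=5000) for d in domain_sizes])

# Publish two 2-way marginals; by sequential composition each gets half the budget.
total_epsilon = 1.0
selected = [(0, 1), (1, 2)]
marginals = {
    cols: to_distribution(
        noisy_marginal(data, cols, domain_sizes, total_epsilon / len(selected), rng))
    for cols in selected
}

# Infer a joint distribution, then answer a query no published marginal covers:
# the estimated probability that attribute 0 equals 1 and attribute 2 equals 0.
joint = max_entropy_joint(marginals, domain_sizes)
answer = joint.sum(axis=1)[1, 0]
print(f"Estimated P(attr0=1, attr2=0) = {answer:.4f}")
```

The inferred joint can also be sampled to produce a synthetic dataset. The interactive query answering that the hybrid approach adds on top of the published marginals is outside the scope of this sketch.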