A Non-Parametric Model for Accurate and Provably Private Synthetic Data Sets
Title | A Non-Parametric Model for Accurate and Provably Private Synthetic Data Sets |
Publication Type | Conference Paper |
Year of Publication | 2017 |
Authors | Soria-Comas, Jordi; Domingo-Ferrer, Josep |
Conference Name | Proceedings of the 12th International Conference on Availability, Reliability and Security |
Publisher | ACM |
Conference Location | New York, NY, USA |
ISBN Number | 978-1-4503-5257-4 |
Keywords | formal privacy, Measurement, Metrics, non-parametric methods, privacy, privacy models, pubcrawl, Synthetic Data, ε-synthetic privacy |
Abstract | Generating synthetic data is a well-known option to limit disclosure risk in sensitive data releases. The usual approach is to build a model for the population and then generate a synthetic data set solely based on the model. We argue that building an accurate population model is difficult and we propose instead to approximate the original data as closely as privacy constraints permit. To enforce an ex ante privacy level when generating synthetic data, we introduce a new privacy model called ε-synthetic privacy. Then, we describe a synthetic data generation method that satisfies ε-synthetic privacy. Finally, we evaluate the utility of the synthetic data generated with our method. |
URL | https://dl.acm.org/citation.cfm?doid=3098954.3098962 |
DOI | 10.1145/3098954.3098962 |
Citation Key | soria-comas_non-parametric_2017 |