Differentially Private K-Means Clustering
Title | Differentially Private K-Means Clustering |
Publication Type | Conference Paper |
Year of Publication | 2016 |
Authors | Su, Dong, Cao, Jianneng, Li, Ninghui, Bertino, Elisa, Jin, Hongxia |
Conference Name | Proceedings of the Sixth ACM Conference on Data and Application Security and Privacy |
Publisher | ACM |
Conference Location | New York, NY, USA |
ISBN Number | 978-1-4503-3935-3 |
Keywords | control theory, Differential privacy, k-means clustering, privacy, private data publishing, pubcrawl, Resiliency |
Abstract | There are two broad approaches for differentially private data analysis. The interactive approach aims at developing customized differentially private algorithms for various data mining tasks. The non-interactive approach aims at developing differentially private algorithms that can output a synopsis of the input dataset, which can then be used to support various data mining tasks. In this paper we study the effectiveness of the two approaches on differentially private k-means clustering. We develop techniques to analyze the empirical error behaviors of the existing interactive and non-interactive approaches. Based on the analysis, we propose an improvement of DPLloyd which is a differentially private version of the Lloyd algorithm. We also propose a non-interactive approach EUGkM which publishes a differentially private synopsis for k-means clustering. Results from extensive and systematic experiments support our analysis and demonstrate the effectiveness of our improvement on DPLloyd and the proposed EUGkM algorithm. |
URL | http://doi.acm.org/10.1145/2857705.2857708 |
DOI | 10.1145/2857705.2857708 |
Citation Key | su_differentially_2016 |