Title | Optimal Differentially Private Algorithms for k-Means Clustering |
Publication Type | Conference Paper |
Year of Publication | 2018 |
Authors | Huang, Zhiyi, Liu, Jinyan |
Conference Name | Proceedings of the 37th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems |
Publisher | ACM |
Conference Location | New York, NY, USA |
ISBN Number | 978-1-4503-4706-8 |
Keywords | composability, Differential privacy, k-means clustering, pubcrawl, Resiliency, Scalability, well separation |
Abstract | We consider privacy-preserving k-means clustering. For the objective of minimizing the Wasserstein distance between the output and the optimal solution, we show that there is a polynomial-time (e,d)-differentially private algorithm which, for any sufficiently large F2 well-separated datasets, outputs k centers that are within Wasserstein distance O(F2) from the optimal. This result improves the previous bounds by removing the dependence on e, number of centers k, and dimension d. Further, we prove a matching lower bound that no (e, d)-differentially private algorithm can guarantee Wasserstein distance less than Omega (F2) and, thus, our positive result is optimal up to a constant factor. For minimizing the k-means objective when the dimension d is bounded, we propose a polynomial-time private local search algorithm that outputs an an-additive approximation when the size of the dataset is at least \textbackslashtextasciitildeO (k3/2 * d * e-1 * poly(a-1)). |
URL | http://doi.acm.org/10.1145/3196959.3196977 |
DOI | 10.1145/3196959.3196977 |
Citation Key | huang_optimal_2018 |