Visible to the public Optimal Differentially Private Algorithms for k-Means Clustering

TitleOptimal Differentially Private Algorithms for k-Means Clustering
Publication TypeConference Paper
Year of Publication2018
AuthorsHuang, Zhiyi, Liu, Jinyan
Conference NameProceedings of the 37th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems
PublisherACM
Conference LocationNew York, NY, USA
ISBN Number978-1-4503-4706-8
Keywordscomposability, Differential privacy, k-means clustering, pubcrawl, Resiliency, Scalability, well separation
AbstractWe consider privacy-preserving k-means clustering. For the objective of minimizing the Wasserstein distance between the output and the optimal solution, we show that there is a polynomial-time (e,d)-differentially private algorithm which, for any sufficiently large F2 well-separated datasets, outputs k centers that are within Wasserstein distance O(F2) from the optimal. This result improves the previous bounds by removing the dependence on e, number of centers k, and dimension d. Further, we prove a matching lower bound that no (e, d)-differentially private algorithm can guarantee Wasserstein distance less than Omega (F2) and, thus, our positive result is optimal up to a constant factor. For minimizing the k-means objective when the dimension d is bounded, we propose a polynomial-time private local search algorithm that outputs an an-additive approximation when the size of the dataset is at least \textbackslashtextasciitildeO (k3/2 * d * e-1 * poly(a-1)).
URLhttp://doi.acm.org/10.1145/3196959.3196977
DOI10.1145/3196959.3196977
Citation Keyhuang_optimal_2018