Title | APDPk-Means: A New Differential Privacy Clustering Algorithm Based on Arithmetic Progression Privacy Budget Allocation |
Publication Type | Conference Paper |
Year of Publication | 2019 |
Authors | Fan, Zexuan; Xu, Xiaolong |
Conference Name | 2019 IEEE 21st International Conference on High Performance Computing and Communications; IEEE 17th International Conference on Smart City; IEEE 5th International Conference on Data Science and Systems (HPCC/SmartCity/DSS) |
Date Published | August |
Keywords | APDPk-means, arithmetic progression privacy budget allocation, Big Data, Clustering algorithms, composability, data clustering, data mining, data protection, Differential privacy, differential privacy clustering algorithm, differential privacy k-means, Human Behavior, Iterative methods, iterative process, k-means algorithm, network data mining, network information security, optimisation, pattern clustering, privacy, privacy budget allocation, privacy budgets, privacy protection, pubcrawl, Resiliency, Resource management, Scalability, security of data, Sensitivity |
Abstract | Protecting users' private data during network data mining has become a pressing issue in big data and network information security. Most current research on differentially private k-means clustering focuses on optimizing the selection of initial centroids. However, under traditional privacy budget allocation the random noise grows too large as the number of iterations increases, which degrades clustering performance. To address this, we improve the privacy budget allocation of the differentially private clustering algorithm DPk-means and propose APDPk-means, a new differential privacy clustering algorithm based on arithmetic progression privacy budget allocation. APDPk-means decomposes the total privacy budget into a decreasing arithmetic progression and allocates the budgets from largest to smallest across iterations, so as to ensure rapid convergence in the early iterations. Experimental results show that, compared with other differentially private k-means algorithms, APDPk-means achieves better availability and clustering quality under the same level of privacy protection. |
DOI | 10.1109/HPCC/SmartCity/DSS.2019.00238 |
Citation Key | fan_apdpk-means_2019 |
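
The allocation idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the record does not give the paper's exact formula, so the function name `arithmetic_budgets`, the `difference` parameter, and the way the first term is derived are all assumptions made for illustration. The sketch only relies on two facts: the per-iteration budgets must sum to the total budget (sequential composition), and the Laplace mechanism uses a noise scale of sensitivity divided by epsilon.

```python
def arithmetic_budgets(total_epsilon, iterations, difference):
    """Split total_epsilon into a decreasing arithmetic progression.

    Hypothetical helper: 'difference' is the (assumed) common difference
    between consecutive per-iteration budgets. The budgets sum to
    total_epsilon and every budget stays strictly positive.
    """
    if total_epsilon <= 0:
        raise ValueError("total_epsilon must be positive")
    if iterations < 1:
        raise ValueError("iterations must be at least 1")
    if iterations == 1:
        return [total_epsilon]

    # The last term is first - (iterations - 1) * difference; it is positive
    # only if difference < 2 * total_epsilon / (iterations * (iterations - 1)).
    max_difference = 2 * total_epsilon / (iterations * (iterations - 1))
    if not 0 <= difference < max_difference:
        raise ValueError("difference must lie in [0, %g)" % max_difference)

    # Sum of an arithmetic progression: n * first - difference * n * (n - 1) / 2.
    # Solving for 'first' so that the sum equals total_epsilon:
    first = total_epsilon / iterations + (iterations - 1) * difference / 2
    return [first - t * difference for t in range(iterations)]


# Example values are illustrative, not taken from the paper.
budgets = arithmetic_budgets(total_epsilon=1.0, iterations=5, difference=0.05)

# Laplace mechanism: noise scale = sensitivity / epsilon, so the larger early
# budgets mean less noise in the early iterations (sensitivity assumed to be 1).
sensitivity = 1.0
noise_scales = [sensitivity / eps for eps in budgets]

print([round(eps, 3) for eps in budgets])
print([round(scale, 3) for scale in noise_scales])
```

Setting `difference` to 0 reduces the sketch to a uniform split of the budget across iterations, which makes it easy to compare the two schedules under the same total budget.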