Optimized Data de-Identification Using Multidimensional k-Anonymity

Title: Optimized Data de-Identification Using Multidimensional k-Anonymity
Publication Type: Conference Paper
Year of Publication: 2018
Authors: Liu, Kai-Cheng; Kuo, Chuan-Wei; Liao, Wen-Chiuan; Wang, Pang-Chieh
Conference Name: 2018 17th IEEE International Conference On Trust, Security And Privacy In Computing And Communications / 12th IEEE International Conference On Big Data Science And Engineering (TrustCom/BigDataSE)
Keywords: anonymization-based de-identification method, Big Data, big data analysis, Big Data analytics, Data analysis, data authenticity preservation, Data models, data privacy, data protection, data quality, de-identification, globalized knowledge economy, greedy algorithms, information loss, information loss rate, k-anonymity, MDR tree method, mean difference recommending tree method, Measurement, Metrics, multidimensional k-anonymity, Numerical models, optimized multidimensional method, personal information, personal privacy, priority weight-adjusted method, privacy, privacy leak risk, privacy models and measurement, privacy preserving, privacy protection mechanism, pubcrawl
Abstract: In the globalized knowledge economy, big data analytics has been widely applied in diverse areas. A critical issue in big data analysis of personal information is the possible leak of personal privacy. It is therefore necessary to have an anonymization-based de-identification method to avoid undesirable privacy leaks. Such a method can prevent published data from being traced back to individuals. Prior empirical research has provided approaches to reduce privacy leak risk, e.g. Maximum Distance to Average Vector (MDAV), the Condensation Approach, and Differential Privacy. However, these methods inevitably generate synthetic data of different sizes and are thus unsuitable for general use. To satisfy the need for general use, k-anonymity can be chosen as the privacy protection mechanism in the de-identification process to ensure that the data are not distorted, because k-anonymity is strong in both protecting privacy and preserving data authenticity. Accordingly, this study proposes an optimized multidimensional method for anonymizing data based on both the priority weight-adjusted method and the mean difference recommending tree method (MDR tree method). The results of this study reveal that the new method generates more reliable anonymous data and reduces the information loss rate.
DOI: 10.1109/TrustCom/BigDataSE.2018.00235
Citation Key: liu_optimized_2018
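
Note on k-anonymity: the abstract relies on the k-anonymity property without defining it. In its standard form, a table is k-anonymous when every combination of quasi-identifier values is shared by at least k records. The following is a minimal Python sketch of that generic check only; it is not the paper's priority weight-adjusted or MDR tree method, and the function name `is_k_anonymous`, the column names, and the sample records are hypothetical, chosen purely for illustration.

```python
from collections import Counter


def is_k_anonymous(records, quasi_identifiers, k):
    """Return True if every quasi-identifier combination occurs at least k times.

    records: list of dicts, one per row (hypothetical layout).
    quasi_identifiers: column names that could be used for re-identification.
    """
    # Count how many records fall into each equivalence class, i.e. each
    # distinct tuple of quasi-identifier values.
    groups = Counter(
        tuple(record[column] for column in quasi_identifiers)
        for record in records
    )
    return all(count >= k for count in groups.values())


# Illustrative, already-generalized data (not taken from the paper).
records = [
    {"age": "30-39", "zip": "100**", "diagnosis": "flu"},
    {"age": "30-39", "zip": "100**", "diagnosis": "asthma"},
    {"age": "40-49", "zip": "200**", "diagnosis": "flu"},
    {"age": "40-49", "zip": "200**", "diagnosis": "diabetes"},
]

two_anonymous = is_k_anonymous(records, ["age", "zip"], 2)
three_anonymous = is_k_anonymous(records, ["age", "zip"], 3)
```

In this sample each (age, zip) pair is shared by exactly two records, so the check passes for k = 2 and fails for k = 3. Generalizing values further (for example, wider age ranges) would raise the achievable k at the cost of more information loss, which is the trade-off the abstract refers to.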