Differential privacy-based data de-identification protection and risk evaluation system

Submitted by grigby1 on Fri, 09/28/2018 - 11:58am

Title	Differential privacy-based data de-identification protection and risk evaluation system
Publication Type	Conference Paper
Year of Publication	2017
Authors	Tsou, Y., Chen, H., Chen, J., Huang, Y., Wang, P.
Conference Name	2017 International Conference on Information and Communication Technology Convergence (ICTC)
Date Published	October
ISBN Number	978-1-5090-4032-2
Keywords	Big Data, composability, data de-identification process, data disclosure estimation system, data mining, data privacy, data protection, data query, de-identification, Differential privacy, differential privacy-based data de-identification protection, Human Behavior, native differential privacy, privacy protection issues, privacy-sensitive data, pubcrawl, query processing, Resiliency, risk evaluation system, Scalability, Synthetic Dataset, value added analysis
Abstract	As more and more technologies to store and analyze massive amounts of data become available, it is extremely important to de-identify privacy-sensitive data so that further analysis can be conducted by different parties. For example, data needs to go through a data de-identification process before being transferred to institutes for further value-added analysis. As such, privacy protection issues associated with the release of data and data mining have become a popular field of study in the domain of big data. As a strict and verifiable definition of privacy, differential privacy (DP) has attracted noteworthy attention and widespread research in recent years. Nevertheless, differential privacy is not practical for most applications because of its performance in generating synthetic datasets for data queries. Moreover, the definition of data protection by randomized noise in native differential privacy is abstract to users. Therefore, we design a pragmatic DP-based system for data de-identification protection and estimation of the risk of data disclosure, in which a DP-based noise addition mechanism is applied to generate synthetic datasets. Furthermore, the risk of data disclosure for these synthetic datasets can be evaluated before they are released to buyers/consumers.
URL	https://ieeexplore.ieee.org/document/8191015
DOI	10.1109/ICTC.2017.8191015
Citation Key	tsou_differential_2017
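
Illustrative sketch (not from the paper): the abstract mentions a DP-based noise addition mechanism for generating synthetic datasets but does not specify it. The Python below shows one common textbook approach, assuming a single categorical attribute, a fixed public domain of categories, and the Laplace mechanism on a histogram with sensitivity 1. All names (`laplace_noise`, `noisy_histogram`, `synthesize`, the age brackets, and the `epsilon` value) are hypothetical, and the standard-library random generator used here is not suitable for production privacy guarantees.

```python
import random
from collections import Counter


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # is distributed as Laplace(0, scale).
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_histogram(values, domain, epsilon, rng):
    # Adding or removing one record changes one count by 1, so the
    # histogram has sensitivity 1 and the Laplace scale is 1 / epsilon.
    counts = Counter(values)
    scale = 1.0 / epsilon
    noisy = {}
    for category in domain:
        noisy_count = counts.get(category, 0) + laplace_noise(scale, rng)
        # Clamping and rounding are post-processing steps.
        noisy[category] = max(0, round(noisy_count))
    return noisy


def synthesize(noisy, rng):
    # Emit each category as many times as its noisy count, then shuffle.
    records = [c for c, n in noisy.items() for _ in range(n)]
    rng.shuffle(records)
    return records


# Hypothetical example data: age brackets of eight individuals.
domain = ["20-29", "30-39", "40-49", "50-59"]
ages = ["20-29", "30-39", "30-39", "30-39", "40-49", "40-49", "50-59", "20-29"]

rng = random.Random(0)  # fixed seed only to make the sketch repeatable
noisy = noisy_histogram(ages, domain, epsilon=1.0, rng=rng)
synthetic = synthesize(noisy, rng)
print(noisy)
print(synthetic)
```

A smaller `epsilon` adds more noise and so gives stronger privacy at the cost of synthetic counts that drift further from the originals. The domain is passed in explicitly because deriving it from the data would itself reveal which categories are present.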
