Visible to the public Biblio

Filters: Author is Dou, Wanchun  [Clear All Filters]
2018-01-10
Wu, Xiaotong, Dou, Wanchun, Ni, Qiang.  2017.  Game Theory Based Privacy Preserving Analysis in Correlated Data Publication. Proceedings of the Australasian Computer Science Week Multiconference. :73:1–73:10.

Privacy preserving on data publication has been an important research field over the past few decades. One of the fundamental challenges in privacy preserving data publication is the trade-off problem between privacy and utility of the single and independent data set. However, recent research works have shown that the advanced privacy mechanism, i.e., differential privacy, is vulnerable when multiple data sets are correlated. In this case, the trade-off problem between privacy and utility is evolved into a game problem, in which the payoff of each player is dependent not only on his privacy parameter, but also on his neighbors' privacy parameters. In this paper, we firstly present the definition of correlated differential privacy to evaluate the real privacy level of a single data set influenced by the other data sets. Then, we construct a game model of multiple players, who each publishes the data set sanitized by differential privacy. Next, we analyze the existence and uniqueness of the pure Nash Equilibrium and demonstrate the sufficient conditions in the game. Finally, we refer to a notion, i.e., the price of anarchy, to evaluate efficiency of the pure Nash Equilibrium.

2017-04-24
Zhang, Xuyun, Leckie, Christopher, Dou, Wanchun, Chen, Jinjun, Kotagiri, Ramamohanarao, Salcic, Zoran.  2016.  Scalable Local-Recoding Anonymization Using Locality Sensitive Hashing for Big Data Privacy Preservation. Proceedings of the 25th ACM International on Conference on Information and Knowledge Management. :1793–1802.

While cloud computing has become an attractive platform for supporting data intensive applications, a major obstacle to the adoption of cloud computing in sectors such as health and defense is the privacy risk associated with releasing datasets to third-parties in the cloud for analysis. A widely-adopted technique for data privacy preservation is to anonymize data via local recoding. However, most existing local-recoding techniques are either serial or distributed without directly optimizing scalability, thus rendering them unsuitable for big data applications. In this paper, we propose a highly scalable approach to local-recoding anonymization in cloud computing, based on Locality Sensitive Hashing (LSH). Specifically, a novel semantic distance metric is presented for use with LSH to measure the similarity between two data records. Then, LSH with the MinHash function family can be employed to divide datasets into multiple partitions for use with MapReduce to parallelize computation while preserving similarity. By using our efficient LSH-based scheme, we can anonymize each partition through the use of a recursive agglomerative \$k\$-member clustering algorithm. Extensive experiments on real-life datasets show that our approach significantly improves the scalability and time-efficiency of local-recoding anonymization by orders of magnitude over existing approaches.