Visible to the public Biblio

Filters: Author is Zhang, Xuyun  [Clear All Filters]
2021-08-31
Hu, Hongsheng, Dobbie, Gillian, Salcic, Zoran, Liu, Meng, Zhang, Jianbing, Zhang, Xuyun.  2020.  A Locality Sensitive Hashing Based Approach for Federated Recommender System. 2020 20th IEEE/ACM International Symposium on Cluster, Cloud and Internet Computing (CCGRID). :836–842.
The recommender system is an important application in big data analytics because accurate recommendation items or high-valued suggestions can bring high profit to both commercial companies and customers. To make precise recommendations, a recommender system often needs large and fine-grained data for training. In the current big data era, data often exist in the form of isolated islands, and it is difficult to integrate the data scattered due to privacy security concerns. Moreover, privacy laws and regulations make it harder to share data. Therefore, designing a privacy-preserving recommender system is of paramount importance. Existing privacy-preserving recommender system models mainly adapt cryptography approaches to achieve privacy preservation. However, cryptography approaches have heavy overhead when performing encryption and decryption operations and they lack a good level of flexibility. In this paper, we propose a Locality Sensitive Hashing (LSH) based approach for federated recommender system. Our proposed efficient and scalable federated recommender system can make full use of multiple source data from different data owners while guaranteeing preservation of privacy of contributing parties. Extensive experiments on real-world benchmark datasets show that our approach can achieve both high time efficiency and accuracy under small privacy budgets.
2017-04-24
Zhang, Xuyun, Leckie, Christopher, Dou, Wanchun, Chen, Jinjun, Kotagiri, Ramamohanarao, Salcic, Zoran.  2016.  Scalable Local-Recoding Anonymization Using Locality Sensitive Hashing for Big Data Privacy Preservation. Proceedings of the 25th ACM International on Conference on Information and Knowledge Management. :1793–1802.

While cloud computing has become an attractive platform for supporting data intensive applications, a major obstacle to the adoption of cloud computing in sectors such as health and defense is the privacy risk associated with releasing datasets to third-parties in the cloud for analysis. A widely-adopted technique for data privacy preservation is to anonymize data via local recoding. However, most existing local-recoding techniques are either serial or distributed without directly optimizing scalability, thus rendering them unsuitable for big data applications. In this paper, we propose a highly scalable approach to local-recoding anonymization in cloud computing, based on Locality Sensitive Hashing (LSH). Specifically, a novel semantic distance metric is presented for use with LSH to measure the similarity between two data records. Then, LSH with the MinHash function family can be employed to divide datasets into multiple partitions for use with MapReduce to parallelize computation while preserving similarity. By using our efficient LSH-based scheme, we can anonymize each partition through the use of a recursive agglomerative \$k\$-member clustering algorithm. Extensive experiments on real-life datasets show that our approach significantly improves the scalability and time-efficiency of local-recoding anonymization by orders of magnitude over existing approaches.