Biblio
Data outsourcing to cloud has been a common IT practice nowadays due to its significant benefits. Meanwhile, security and privacy concerns are critical obstacles to hinder the further adoption of cloud. Although data encryption can mitigate the problem, it reduces the functionality of query processing, e.g., disabling SQL queries. Several schemes have been proposed to enable one-dimensional query on encrypted data, but multi-dimensional range query has not been well addressed. In this paper, we propose a secure and scalable scheme that can support multi-dimensional range queries over encrypted data. The proposed scheme has three salient features: (1) Privacy: the server cannot learn the contents of queries and data records during query processing. (2) Efficiency: we utilize hierarchical cubes to encode multi-dimensional data records and construct a secure tree index on top of such encoding to achieve sublinear query time. (3) Verifiability: our scheme allows users to verify the correctness and completeness of the query results to address server's malicious behaviors. We perform formal security analysis and comprehensive experimental evaluations. The results on real datasets demonstrate that our scheme achieves practical performance while guaranteeing data privacy and result integrity.
Computing similarity, especially Jaccard Similarity, between two datasets is a fundamental building block in big data analytics, and extensive applications including genome matching, plagiarism detection, social networking, etc. The increasing user privacy concerns over the release of has sensitive data have made it desirable and necessary for two users to evaluate Jaccard Similarity over their datasets in a privacy-preserving manner. In this paper, we propose two efficient and secure protocols to compute the Jaccard Similarity of two users' private sets with the help of an unfully-trusted server. Specifically, in order to boost the efficiency, we leverage Minhashing algorithm on encrypted data, where the output of our protocols is guaranteed to be a close approximation of the exact value. In both protocols, only an approximate similarity result is leaked to the server and users. The first protocol is secure against a semi-honest server, while the second protocol, with a novel consistency-check mechanism, further achieves result verifiability against a malicious server who cheats in the executions. Experimental results show that our first protocol computes an approximate Jaccard Similarity of two billion-element sets within only 6 minutes (under 256-bit security in parallel mode). To the best of our knowledge, our consistency-check mechanism represents the very first work to realize an efficient verification particularly on approximate similarity computation.