Privacy-preserving Search and Computation for Cloud Data

ABSTRACT

Cloud computing enables convenient, on-demand network access to a centralized pool of configurable computing resources (e.g., networks, servers, applications, and services) that can be rapidly deployed with great efficiency and minimal management overhead. Cloud computing therefore serves as a natural hub for hosting the massive data continuously generated by the Internet and social media, which take many forms, e.g., text, pictures, and multimedia. In recent years, numerous cloud services have been deployed under this model. While the merits of cloud services are easy to see, their security and privacy risks remain largely unresolved. Data congregated in the cloud may include sensitive personal, business, and organizational information, whose nature prohibits uncontrolled access. In such scenarios, many solutions rely on encryption to protect data confidentiality. Data encryption, however, drastically decreases data utility and makes effective data search and retrieval very challenging. This calls for effective search techniques over encrypted cloud data of massive scale. Such techniques should enable the critical search functionalities long enjoyed over unencrypted data in modern search engines such as Google and Bing. The adequacy of such techniques is essential to the long-term success of cloud services and to the ultimate privacy protection of both individuals and organizations.

Current technologies in this domain are still at an early stage. Most existing searchable encryption techniques support only simple keyword or predicate matching, either by pre-building a secure search index or by embedding a predicate in the encryption algorithm. These techniques were traditionally developed by the cryptographic community with only simple text data in mind; they are in general very limited in functionality and fall short in usability, scalability, and performance. To address these challenges, expertise from different communities, including cryptography, security, databases, information retrieval, networking, algorithms, and distributed systems, must come together. Below is a list of open problems that need immediate attention: 1) How do we enable versatile, highly usable keyword search over encrypted data, including functionalities such as fuzzy matching, result ranking, multi-keyword search, and similarity search? The ultimate goal is to enable rich search semantics in a privacy-preserving manner while efficiently supporting the large-scale and distributed nature of cloud data. 2) How do we go beyond text data? How do we enable privacy-preserving graph search, image search, and/or multimedia search? Efficient techniques for searching high-dimensional data have to be developed. 3) How do we effectively control access to the search results? Scalable and fine-grained access control over search results is indispensable to privacy-preserving search for cloud data, given the huge user base within which individual users may need to be differentiated [1,2].
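To make the "pre-built secure search index" idea concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not a scheme taken from the cited work: the function names (keyword_token, build_index, search) are hypothetical, keyword tokens are computed with HMAC-SHA256, and document IDs are stored in the clear. The sketch supports exact keyword matching only, and the server still learns which tokens are queried and which documents match. Those are the kinds of limitations described above.

```python
import hashlib
import hmac
from collections import defaultdict
from typing import Dict, Set


def keyword_token(key: bytes, keyword: str) -> str:
    """Deterministic search token: HMAC-SHA256 of the lower-cased keyword."""
    return hmac.new(key, keyword.lower().encode("utf-8"), hashlib.sha256).hexdigest()


def build_index(key: bytes, documents: Dict[str, str]) -> Dict[str, Set[str]]:
    """Data owner side: map each keyword token to the IDs of documents containing it.

    Assumption for this sketch: documents are split on whitespace, and the
    document bodies themselves are encrypted separately (not shown here).
    """
    index: Dict[str, Set[str]] = defaultdict(set)
    for doc_id, text in documents.items():
        for word in set(text.lower().split()):
            index[keyword_token(key, word)].add(doc_id)
    return dict(index)


def search(index: Dict[str, Set[str]], token: str) -> Set[str]:
    """Server side: look up a token without ever seeing the plaintext keyword."""
    return index.get(token, set())


if __name__ == "__main__":
    secret_key = b"demo-key-for-illustration-only!!"  # hypothetical key
    docs = {
        "d1": "cloud data privacy",
        "d2": "encrypted cloud search",
        "d3": "graph search",
    }
    idx = build_index(secret_key, docs)
    # The user derives the token locally and sends only the token to the server.
    print(sorted(search(idx, keyword_token(secret_key, "cloud"))))
```

A misspelled query such as "clowd" yields a different token and therefore no results, which is why fuzzy, ranked, and multi-keyword search over such indexes are listed as open problems.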

Cloud data are also frequently processed for data mining. Many cloud services mine their user data for purposes ranging from performance optimization and user experience improvement to targeted advertising. These purposes often conflict with users' privacy policies. Furthermore, businesses and organizations may resort to cloud services for resource-intensive computing tasks, e.g., bio-computing, where both the data and the computing results need to be protected from others, including the cloud itself [1]. It is therefore critical to develop privacy-preserving and proof-carrying computation and data mining mechanisms that suit large-scale applications. Moreover, performance is fundamental to the success of any such solution. In theory, we can always rely on the cryptographic holy grail, fully homomorphic encryption (FHE), to construct a universal solution with strong security guarantees. The performance of FHE, however, is unacceptable today and is likely to remain so in the near future. An alternative approach is to understand the nature of an application and its security requirements, and to develop application-specific solutions that are highly customized and achieve desirable trade-offs among privacy protection, performance, and other factors. Considerable effort is needed in this direction.
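As a small illustration of computing on encrypted data, the sketch below implements a toy version of the Paillier cryptosystem in Python. Paillier is not named in this abstract and is not FHE: it is only additively homomorphic, so it supports sums over ciphertexts but not arbitrary computation. It is shown here as one example of a narrower, cheaper primitive that an application-specific design might use. The primes are well-known Mersenne primes chosen for readability and are far too small for real use; all function names are hypothetical.

```python
import math
import secrets

# Toy parameters (assumption for this sketch): two Mersenne primes.
# Real deployments need large random primes and a vetted library.
P = 2**31 - 1
Q = 2**61 - 1


def keygen(p: int = P, q: int = Q):
    """Return the public modulus n and the private pair (lam, mu)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    mu = pow(lam, -1, n)  # valid because the generator is fixed to g = n + 1
    return n, (lam, mu)


def encrypt(n: int, m: int) -> int:
    """Encrypt 0 <= m < n as c = (1 + n)^m * r^n mod n^2 with fresh random r."""
    n2 = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1
        if math.gcd(r, n) == 1:
            break
    # (1 + n)^m is congruent to 1 + m*n modulo n^2.
    return ((1 + m * n) % n2) * pow(r, n, n2) % n2


def decrypt(n: int, priv, c: int) -> int:
    """Recover m = L(c^lam mod n^2) * mu mod n, where L(x) = (x - 1) // n."""
    lam, mu = priv
    x = pow(c, lam, n * n)
    return ((x - 1) // n) * mu % n


def add_encrypted(n: int, c1: int, c2: int) -> int:
    """Cloud side: multiplying ciphertexts adds the underlying plaintexts."""
    return c1 * c2 % (n * n)


if __name__ == "__main__":
    n, priv = keygen()
    c_sum = add_encrypted(n, encrypt(n, 20), encrypt(n, 22))
    print(decrypt(n, priv, c_sum))
```

In this sketch the party holding only n and the ciphertexts can compute an encrypted sum without learning the inputs or the result; only the holder of the private pair can decrypt it.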

[1] K. Ren, C. Wang, and Q. Wang, "Security Challenges for the Public Cloud," IEEE Internet Computing, vol. 16, no. 1, pp. 69-73, Jan./Feb. 2012.
[2] K. Ren, C. Wang, and Q. Wang, "Toward Secure and Effective Data Utilization in Public Cloud," IEEE Network, 2012, to appear.

Award ID: 1262277

License: Creative Commons 2.5
