Load balancing for privacy-preserving access to big data in cloud

Title: Load balancing for privacy-preserving access to big data in cloud
Publication Type: Conference Paper
Year of Publication: 2014
Authors: Peng Li, Song Guo
Conference Name: Computer Communications Workshops (INFOCOM WKSHPS), 2014 IEEE Conference on
Date Published: April
Keywords: Big Data, cloud computing, cloud storage, computational complexity, Conferences, data availability, data content protection, data maintenance cost reduction, data management, data placement problem, data privacy, data protection, Data security, Distributed databases, distributed file system, Encryption, file server, file servers, geo-distributed cloud site, information retrieval, load balanced storage system, low-complexity algorithm, NP-hardness, ORAM algorithm, privacy preserving access, Random access memory, random data placement algorithm, random processes, resource allocation, responsiveness, security, Servers, storage management, storage server
Abstract

In the era of big data, many users and companies are moving their data to cloud storage to simplify data management and reduce data maintenance costs. However, security and privacy have become major concerns because third-party cloud service providers are not always trustworthy. Although data contents can be protected by encryption, access patterns, which themselves carry important information, remain exposed to cloud providers and malicious attackers. In this paper, we apply the oblivious RAM (ORAM) algorithm to enable privacy-preserving access to big data deployed in distributed file systems built upon hundreds or thousands of servers in one or more geo-distributed cloud sites. Because the ORAM algorithm leads to a serious access load imbalance among storage servers, we study a data placement problem to achieve a load-balanced storage system with improved availability and responsiveness. Because this problem is NP-hard, we propose a low-complexity algorithm that scales to the large problem sizes that arise with big data. Extensive simulations show that the proposed algorithm finds results close to the optimal solution and significantly outperforms a random data placement algorithm.
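
To make the data placement problem in the abstract more concrete, the following Python sketch contrasts a simple greedy heuristic with random placement on a toy instance. It is not the algorithm proposed in the paper, which the abstract does not describe. It is a generic heaviest-first, least-loaded-server heuristic applied to a simplified model, and it ignores ORAM structure, replication, and geo-distribution. The function names, the per-block access loads, and the choice of maximum server load as the balance metric are all assumptions made for illustration.

```python
import heapq
import random


def greedy_placement(block_loads, num_servers):
    """Place heaviest blocks first, each on the currently least-loaded server.

    Hypothetical helper: block_loads maps a block id to its expected access load.
    """
    heap = [(0.0, s) for s in range(num_servers)]  # (current load, server id)
    heapq.heapify(heap)
    placement = {}
    for block, load in sorted(block_loads.items(), key=lambda kv: -kv[1]):
        server_load, server = heapq.heappop(heap)
        placement[block] = server
        heapq.heappush(heap, (server_load + load, server))
    return placement


def random_placement(block_loads, num_servers, rng):
    """Baseline: assign every block to a uniformly random server."""
    return {block: rng.randrange(num_servers) for block in block_loads}


def max_server_load(block_loads, placement, num_servers):
    """Balance metric assumed here: the total load on the busiest server."""
    totals = [0.0] * num_servers
    for block, server in placement.items():
        totals[server] += block_loads[block]
    return max(totals)


if __name__ == "__main__":
    rng = random.Random(0)
    # Skewed synthetic loads standing in for uneven access frequencies.
    loads = {b: rng.paretovariate(1.5) for b in range(1000)}
    servers = 20
    greedy = max_server_load(loads, greedy_placement(loads, servers), servers)
    rand = max_server_load(loads, random_placement(loads, servers, rng), servers)
    print(f"greedy max load: {greedy:.2f}")
    print(f"random max load: {rand:.2f}")
```

Sorting the blocks costs O(n log n) and each heap operation costs O(log m) for m servers, so a heuristic of this kind stays cheap at large scale, which is the property the abstract calls low complexity. It carries no guarantee of matching the optimal placement.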

DOI: 10.1109/INFCOMW.2014.6849286
Citation Key: 6849286