Biblio
In local differential privacy (LDP), each user perturbs her data locally before sending the noisy data to a data collector. The latter then analyzes the data to obtain useful statistics. Unlike the setting of centralized differential privacy, in LDP the data collector never gains access to the exact values of sensitive data, which protects not only the privacy of data contributors but also the collector itself against the risk of potential data leakage. Existing LDP solutions in the literature are mostly limited to the case that each user possesses a tuple of numeric or categorical values, and the data collector computes basic statistics such as counts or mean values. To the best of our knowledge, no existing work tackles more complex data mining tasks such as heavy hitter discovery over set-valued data. In this paper, we present a systematic study of heavy hitter mining under LDP. We first review existing solutions, extend them to the heavy hitter estimation, and explain why their effectiveness is limited. We then propose LDPMiner, a two-phase mechanism for obtaining accurate heavy hitters with LDP. The main idea is to first gather a candidate set of heavy hitters using a portion of the privacy budget, and focus the remaining budget on refining the candidate set in a second phase, which is much more efficient budget-wise than obtaining the heavy hitters directly from the whole dataset. We provide both in-depth theoretical analysis and extensive experiments to compare LDPMiner against adaptations of previous solutions. The results show that LDPMiner significantly improves over existing methods. More importantly, LDPMiner successfully identifies the majority true heavy hitters in practical settings.
The image and multimedia data produced by individuals and enterprises is increasing every day. Motivated by the advances in cloud computing, there is a growing need to outsource such computational intensive image feature detection tasks to cloud for its economic computing resources and on-demand ubiquitous access. However, the concerns over the effective protection of private image and multimedia data when outsourcing it to cloud platform become the major barrier that impedes the further implementation of cloud computing techniques over massive amount of image and multimedia data. To address this fundamental challenge, we study the state-of-the-art image feature detection algorithms and focus on Scalar Invariant Feature Transform (SIFT), which is one of the most important local feature detection algorithms and has been broadly employed in different areas, including object recognition, image matching, robotic mapping, and so on. We analyze and model the privacy requirements in outsourcing SIFT computation and propose Secure Scalar Invariant Feature Transform (SecSIFT), a high-performance privacy-preserving SIFT feature detection system. In contrast to previous works, the proposed design is not restricted by the efficiency limitations of current homomorphic encryption scheme. In our design, we decompose and distribute the computation procedures of the original SIFT algorithm to a set of independent, co-operative cloud servers and keep the outsourced computation procedures as simple as possible to avoid utilizing a computationally expensive homomorphic encryption scheme. The proposed SecSIFT enables implementation with practical computation and communication complexity. Extensive experimental results demonstrate that SecSIFT performs comparably to original SIFT on image benchmarks while capable of preserving the privacy in an efficient way.