Biblio
Skyline computation is an increasingly popular query, with broad applicability to many domains. Given the trend to outsource databases, and due to the sensitive nature of the data (e.g., in healthcare), it is essential to evaluate skylines on encrypted datasets. Research efforts acknowledged the importance of secure skyline computation, but existing solutions suffer from several shortcomings: (i) they only provide ad-hoc security; (ii) they are prohibitively expensive; or (iii) they rely on assumptions such as the presence of multiple non-colluding parties in the protocol. Inspired by solutions for secure nearest-neighbors, we conjecture that a secure and efficient way to compute skylines is through result materialization. However, materialization is much more challenging for skylines queries due to large space requirements. We show that pre-computing skyline results while minimizing storage overhead is NP-hard, and we provide heuristics that solve the problem more efficiently, while maintaining storage at reasonable levels. Our algorithms are novel and also applicable to regular skyline computation, but we focus on the encrypted setting where materialization reduces the response time of skyline queries from hours to seconds. Extensive experiments show that we clearly outperform existing work in terms of performance, and our security analysis proves that we obtain a small (and quantifiable) data leakage.
Location entropy (LE) is a popular metric for measuring the popularity of various locations (e.g., points-of-interest). Unlike other metrics computed from only the number of (unique) visits to a location, namely frequency, LE also captures the diversity of the users' visits, and is thus more accurate than other metrics. Current solutions for computing LE require full access to the past visits of users to locations, which poses privacy threats. This paper discusses, for the first time, the problem of perturbing location entropy for a set of locations according to differential privacy. The problem is challenging because removing a single user from the dataset will impact multiple records of the database; i.e., all the visits made by that user to various locations. Towards this end, we first derive non-trivial, tight bounds for both local and global sensitivity of LE, and show that to satisfy ε-differential privacy, a large amount of noise must be introduced, rendering the published results useless. Hence, we propose a thresholding technique to limit the number of users' visits, which significantly reduces the perturbation error but introduces an approximation error. To achieve better utility, we extend the technique by adopting two weaker notions of privacy: smooth sensitivity (slightly weaker) and crowd-blending (strictly weaker). Extensive experiments on synthetic and real-world datasets show that our proposed techniques preserve original data distribution without compromising location privacy.