Biblio | CPS-VO

Zeighami, Sepanta, Ghinita, Gabriel, Shahabi, Cyrus. 2021. Secure Dynamic Skyline Queries Using Result Materialization. 2021 IEEE 37th International Conference on Data Engineering (ICDE). :157–168.

Skyline computation is an increasingly popular query, with broad applicability to many domains. Given the trend to outsource databases, and due to the sensitive nature of the data (e.g., in healthcare), it is essential to evaluate skylines on encrypted datasets. Research efforts acknowledged the importance of secure skyline computation, but existing solutions suffer from several shortcomings: (i) they only provide ad-hoc security; (ii) they are prohibitively expensive; or (iii) they rely on assumptions such as the presence of multiple non-colluding parties in the protocol. Inspired by solutions for secure nearest-neighbors, we conjecture that a secure and efficient way to compute skylines is through result materialization. However, materialization is much more challenging for skyline queries due to large space requirements. We show that pre-computing skyline results while minimizing storage overhead is NP-hard, and we provide heuristics that solve the problem more efficiently, while maintaining storage at reasonable levels. Our algorithms are novel and also applicable to regular skyline computation, but we focus on the encrypted setting where materialization reduces the response time of skyline queries from hours to seconds. Extensive experiments show that we clearly outperform existing work in terms of performance, and our security analysis proves that we obtain a small (and quantifiable) data leakage.
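
For readers unfamiliar with the query itself, the following is a minimal plaintext sketch of skyline and dynamic skyline computation. It is not the paper's materialization scheme and involves no encryption. It assumes the common convention that smaller values are better in every dimension, and that a dynamic skyline is taken over coordinate-wise absolute distances to a query point. The function names are illustrative.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is no worse than b in every dimension and strictly
    better in at least one (assumption: smaller is better)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def skyline(points: List[Point]) -> List[Point]:
    """Naive O(n^2) skyline: keep every point that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


def dynamic_skyline(points: List[Point], query: Point) -> List[Point]:
    """Skyline relative to a query point: each point is first mapped to its
    coordinate-wise absolute distance from the query, then compared."""
    mapped = [tuple(abs(x - q) for x, q in zip(p, query)) for p in points]
    return [
        points[i]
        for i, m in enumerate(mapped)
        if not any(dominates(other, m) for other in mapped)
    ]
```

Because the dynamic skyline depends on the query point, its result changes from query to query, which is one way to see why pre-computing all results can require a large amount of storage.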

To, Hien, Shahabi, Cyrus, Xiong, Li. 2018. Privacy-Preserving Online Task Assignment in Spatial Crowdsourcing with Untrusted Server. 2018 IEEE 34th International Conference on Data Engineering (ICDE). :833–844.

With spatial crowdsourcing (SC), requesters outsource their spatiotemporal tasks (tasks associated with location and time) to a set of workers, who will perform the tasks by physically traveling to the tasks' locations. However, current solutions require the locations of the workers and/or the tasks to be disclosed to untrusted parties (SC server) for effective assignments of tasks to workers. In this paper we propose a framework for assigning tasks to workers in an online manner without compromising the location privacy of workers and tasks. We perturb the locations of both tasks and workers based on geo-indistinguishability and then devise techniques to quantify the probability of reachability between a task and a worker, given their perturbed locations. We investigate both analytical and empirical models for quantifying the worker-task pair reachability and propose task assignment strategies that strike a balance among various metrics such as the number of completed tasks, worker travel distance and system overhead. Extensive experiments on real-world datasets show that our proposed techniques result in minimal disclosure of task locations and no disclosure of worker locations without significantly sacrificing the total number of assigned tasks.
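
As a rough illustration of the perturbation step, the sketch below draws planar Laplace noise, the mechanism commonly used to achieve geo-indistinguishability, and estimates worker-task reachability by Monte Carlo sampling. It is not the authors' analytical or empirical model. It assumes planar coordinates in a single distance unit, the same epsilon for workers and tasks, and no prior knowledge about true locations. All function and parameter names are hypothetical.

```python
import math
import random
from typing import Tuple

Location = Tuple[float, float]


def planar_laplace_noise(epsilon: float, rng: random.Random) -> Location:
    """Draw a 2D noise vector: uniform angle, radius ~ Gamma(shape=2, scale=1/epsilon),
    which is the radial distribution of the planar Laplace density."""
    theta = rng.uniform(0.0, 2.0 * math.pi)
    radius = rng.gammavariate(2.0, 1.0 / epsilon)
    return radius * math.cos(theta), radius * math.sin(theta)


def perturb(location: Location, epsilon: float, rng: random.Random) -> Location:
    """Return the location shifted by one planar Laplace noise sample."""
    dx, dy = planar_laplace_noise(epsilon, rng)
    return location[0] + dx, location[1] + dy


def estimate_reachability(
    noisy_worker: Location,
    noisy_task: Location,
    reach_radius: float,
    epsilon: float,
    rng: random.Random,
    samples: int = 2000,
) -> float:
    """Monte Carlo estimate of P(true distance <= reach_radius).

    Assumption: with no prior, candidate true locations are obtained by
    re-applying the (symmetric) noise around each reported location.
    """
    hits = 0
    for _ in range(samples):
        wx, wy = perturb(noisy_worker, epsilon, rng)
        tx, ty = perturb(noisy_task, epsilon, rng)
        if math.hypot(wx - tx, wy - ty) <= reach_radius:
            hits += 1
    return hits / samples
```

A server could, for example, rank candidate workers for a task by this estimate and contact only those above some threshold; the threshold itself would be a tuning choice that trades assignment success against overhead.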

To, Hien, Nguyen, Kien, Shahabi, Cyrus. 2016. Differentially Private Publication of Location Entropy. Proceedings of the 24th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems. :35:1–35:10.

Location entropy (LE) is a popular metric for measuring the popularity of various locations (e.g., points-of-interest). Unlike other metrics computed from only the number of (unique) visits to a location, namely frequency, LE also captures the diversity of the users' visits, and is thus more accurate than other metrics. Current solutions for computing LE require full access to the past visits of users to locations, which poses privacy threats. This paper discusses, for the first time, the problem of perturbing location entropy for a set of locations according to differential privacy. The problem is challenging because removing a single user from the dataset will impact multiple records of the database; i.e., all the visits made by that user to various locations. Towards this end, we first derive non-trivial, tight bounds for both local and global sensitivity of LE, and show that to satisfy ε-differential privacy, a large amount of noise must be introduced, rendering the published results useless. Hence, we propose a thresholding technique to limit the number of users' visits, which significantly reduces the perturbation error but introduces an approximation error. To achieve better utility, we extend the technique by adopting two weaker notions of privacy: smooth sensitivity (slightly weaker) and crowd-blending (strictly weaker). Extensive experiments on synthetic and real-world datasets show that our proposed techniques preserve original data distribution without compromising location privacy.
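
The sketch below shows how location entropy can be computed from per-user visit counts, how capping each user's visits (one reading of the thresholding idea) changes the value, and how Laplace noise would be added. It uses the standard definition LE(l) = -Σ_u p_u ln p_u, where p_u is the fraction of visits to location l made by user u. The sensitivity bounds derived in the paper are not reproduced here, so `sensitivity` is a caller-supplied placeholder, and all names are illustrative.

```python
import math
import random
from typing import Dict, Optional


def location_entropy(visits_per_user: Dict[str, int],
                     max_visits: Optional[int] = None) -> float:
    """Shannon entropy (natural log) of the per-user visit distribution
    at one location. If max_visits is given, each user's count is capped
    first (assumed form of the thresholding step)."""
    counts = [c for c in visits_per_user.values() if c > 0]
    if max_visits is not None:
        counts = [min(c, max_visits) for c in counts]
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log(c / total) for c in counts)


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Laplace(0, scale) sample as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_location_entropy(visits_per_user: Dict[str, int],
                           epsilon: float,
                           sensitivity: float,
                           rng: random.Random,
                           max_visits: Optional[int] = None) -> float:
    """Laplace mechanism: true value plus noise of scale sensitivity/epsilon.
    The sensitivity must be supplied by the caller (placeholder here)."""
    value = location_entropy(visits_per_user, max_visits)
    return value + laplace_noise(sensitivity / epsilon, rng)
```

Two users with equal visit counts give an entropy of ln 2, while a location visited by a single user gives 0, which matches the intuition that LE rewards diversity of visitors rather than raw visit frequency.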