A Secure Sum Protocol and Its Application to Privacy-Preserving Multi-Party Analytics
Title | A Secure Sum Protocol and Its Application to Privacy-Preserving Multi-Party Analytics |
Publication Type | Conference Paper |
Year of Publication | 2017 |
Authors | Mehnaz, Shagufta, Bellala, Gowtham, Bertino, Elisa |
Conference Name | Proceedings of the 22nd ACM on Symposium on Access Control Models and Technologies |
Publisher | ACM |
Conference Location | New York, NY, USA |
ISBN Number | 978-1-4503-4702-0 |
Keywords | Big Data analytics, Data Sanitization, machine learning, Measurement, Metrics, privacy, privacy models, privacy-preserving protocols, pubcrawl |
Abstract | Many enterprises are transitioning towards data-driven business processes. There are numerous situations where multiple parties would like to share data towards a common goal if it were possible to simultaneously protect the privacy and security of the individuals and organizations described in the data. Existing solutions for multi-party analytics that follow the so-called Data Lake paradigm have parties transfer their raw data to a trusted third party (i.e., a mediator), which then performs the desired analysis on the global data and shares the results with the parties. However, such a solution does not fit many applications, such as healthcare, finance, and the Internet of Things, where privacy is a strong concern. Motivated by the increasing demands for data privacy, we study the problem of privacy-preserving multi-party data analytics, where the goal is to enable analytics on multi-party data without compromising the data privacy of each individual party. In this paper, we first propose a secure sum protocol with strong security guarantees. The proposed secure sum protocol is resistant to collusion attacks even with N-2 parties colluding, where N denotes the total number of collaborating parties. We then use this protocol to propose two secure gradient descent algorithms, one for horizontally partitioned data and the other for vertically partitioned data. The proposed framework is generic and applies to a wide class of machine learning problems. We demonstrate our solution for two popular use cases, regression and classification, and evaluate the performance of the proposed solution in terms of the obtained model accuracy, latency, and communication cost. In addition, we perform a scalability analysis to evaluate the performance of the proposed solution as the data size and the number of parties increase. |
URL | https://dl.acm.org/citation.cfm?doid=3078861.3078869 |
DOI | 10.1145/3078861.3078869 |
Citation Key | mehnaz_secure_2017 |
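
Illustrative note: the abstract does not spell out the paper's secure sum protocol, so the sketch below is not the authors' construction. It shows a generic textbook approach to the same goal, additive secret sharing, in which each party splits its private value into random shares so that only the total is revealed. The names `MODULUS`, `make_shares`, and `secure_sum` are hypothetical, the modulus is an assumed public parameter that must exceed the true sum, and all parties are simulated in one process rather than over a network.

```python
import secrets

# Assumed public modulus (hypothetical choice); it must be larger than the true sum.
MODULUS = 2**61 - 1


def make_shares(value, n_parties, modulus=MODULUS):
    """Split `value` into n_parties random shares that sum to value mod modulus."""
    shares = [secrets.randbelow(modulus) for _ in range(n_parties - 1)]
    last = (value - sum(shares)) % modulus
    return shares + [last]


def secure_sum(private_values, modulus=MODULUS):
    """Simulate an additive-sharing secure sum among len(private_values) parties."""
    n = len(private_values)
    # share_matrix[i][j] is the share that party i would send to party j.
    share_matrix = [make_shares(v, n, modulus) for v in private_values]
    # Each party j publishes only the sum of the shares it received.
    partial_sums = [
        sum(share_matrix[i][j] for i in range(n)) % modulus for j in range(n)
    ]
    # Adding the published partial sums yields the total of all private values.
    return sum(partial_sums) % modulus


if __name__ == "__main__":
    print(secure_sum([3, 5, 10]))
```

In a secure gradient descent setting of the kind the abstract mentions, the private values would be each party's local gradient contributions, encoded as non-negative integers before sharing; that encoding step is omitted here.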