Bibliography
Collecting and analyzing personal data is central to modern information applications. While the privacy of data providers should be protected, some adversarial users may misbehave when they cannot be identified; at the same time, the privacy of honest users must not be infringed. Detecting anomalies without revealing the identities of normal users is therefore essential for operating information systems that handle personal data. Although various statistical and machine-learning methods have been developed for anomaly detection, it is difficult to know in advance which anomalies will arise, so a "general" framework that can employ any anomaly detection method, regardless of the type of data and the nature of the abnormality, would be useful. In this paper, we propose a privacy-preserving anomaly detection framework that allows an authority to detect adversarial users while honest users remain anonymous. Using two cryptographic techniques, group signatures with message-dependent opening (GS-MDO) and public-key encryption with non-interactive opening (PKENO), we construct a correspondence table that links users and data in a secure way, and the framework can employ any anonymization technique and any anomaly detection method. It is particularly worth noting that no "big brother" exists: no single entity can identify users on its own, yet bad behavior is always traceable. We also report on an implementation of our framework; its overhead is on the order of tens of milliseconds.
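To make the dataflow concrete, the following minimal sketch (Python) simulates the submit/flag/trace pipeline the abstract describes. It is an illustration, not the paper's construction: the GS-MDO and PKENO primitives are replaced by trivial, insecure stand-ins, and every name below (Submission, sign_gs_mdo, encrypt_pkeno, decrypt_pkeno_with_proof) is hypothetical.

    # Illustrative only: stand-ins simulate GS-MDO / PKENO with trivial
    # (insecure) operations so the end-to-end dataflow can be followed.
    import secrets
    from dataclasses import dataclass

    @dataclass
    class Submission:
        payload: bytes        # anonymized data record
        signature: bytes      # stand-in for a GS-MDO group signature
        id_ciphertext: bytes  # stand-in for a PKENO ciphertext of the user id

    def sign_gs_mdo(user_id: str, payload: bytes) -> bytes:
        # A real GS-MDO signature proves group membership while hiding
        # user_id; opening it requires a token tied to this specific message.
        return secrets.token_bytes(32)

    def encrypt_pkeno(user_id: str) -> bytes:
        # A real PKENO ciphertext can later be opened together with a
        # publicly verifiable proof of what it decrypts to. NOT encryption.
        return user_id.encode()

    def decrypt_pkeno_with_proof(ct: bytes) -> str:
        # Stand-in: real PKENO decryption also outputs a checkable proof.
        return ct.decode()

    def submit(user_id: str, payload: bytes) -> Submission:
        # The signature plus the ciphertext form one row of the
        # correspondence table linking a user to their data.
        return Submission(payload, sign_gs_mdo(user_id, payload),
                          encrypt_pkeno(user_id))

    def detect_anomaly(payload: bytes) -> bool:
        # Any detector can be plugged in; the framework is detector-agnostic.
        return payload.startswith(b"ATTACK")

    def trace(sub: Submission) -> str | None:
        # Only flagged submissions are opened, so honest users stay
        # anonymous. In the real framework no single entity can do this
        # alone: opening needs both the message-dependent opening token
        # and the PKENO decryption.
        if detect_anomaly(sub.payload):
            return decrypt_pkeno_with_proof(sub.id_ciphertext)
        return None

    subs = [submit("alice", b"normal record"),
            submit("mallory", b"ATTACK record")]
    print([trace(s) for s in subs])  # -> [None, 'mallory']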
Many enterprises are transitioning toward data-driven business processes. In numerous situations, multiple parties would like to share data toward a common goal, provided the privacy and security of the individuals and organizations described in the data can be protected. Existing solutions for multi-party analytics that follow the so-called Data Lake paradigm have parties transfer their raw data to a trusted third party (a mediator), which performs the desired analysis on the global data and shares the results with the parties. Such a solution, however, does not fit applications such as healthcare, finance, and the Internet of Things, where privacy is a strong concern. Motivated by the increasing demand for data privacy, we study privacy-preserving multi-party data analytics, where the goal is to enable analytics on multi-party data without compromising the privacy of any individual party's data. We first propose a secure sum protocol with strong security guarantees: it resists collusion attacks even when N-2 parties collude, where N denotes the total number of collaborating parties. We then use this protocol to build two secure gradient descent algorithms, one for horizontally partitioned data and one for vertically partitioned data. The proposed framework is generic and applies to a wide class of machine learning problems. We demonstrate our solution on two popular use cases, regression and classification, and evaluate its performance in terms of model accuracy, latency, and communication cost. In addition, we perform a scalability analysis of the proposed solution as the data size and the number of parties increase.
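The abstract does not spell out the protocol, but additive secret sharing is one standard way to obtain the stated N-2 collusion resistance; the sketch below (Python; the modulus P and all names are assumptions, not the paper's notation) shows the idea. Each party splits its private value into N random shares that sum to the value, so the published partial sums reveal only the global total.

    # Illustrative additive-secret-sharing secure sum; the paper's exact
    # protocol may differ.
    import secrets

    P = 2**61 - 1  # public prime modulus; all arithmetic is mod P (assumed)

    def make_shares(value: int, n: int) -> list[int]:
        # Split `value` into n random shares summing to value mod P.
        shares = [secrets.randbelow(P) for _ in range(n - 1)]
        shares.append((value - sum(shares)) % P)
        return shares

    def secure_sum(private_values: list[int]) -> int:
        n = len(private_values)
        # Round 1: party i sends its j-th share to party j.
        all_shares = [make_shares(v, n) for v in private_values]
        # Round 2: each party publishes the sum of the shares it received.
        partials = [sum(all_shares[i][j] for i in range(n)) % P
                    for j in range(n)]
        # Even if N-2 parties collude, the two honest parties' individual
        # inputs stay hidden; only their combined contribution to the
        # total can be inferred, which any sum inherently reveals.
        return sum(partials) % P

    print(secure_sum([10, 20, 30, 40]))  # -> 100

In the horizontally partitioned setting, each gradient descent step could then aggregate the parties' local gradients coordinate by coordinate with such a secure sum (after fixed-point encoding of the real-valued gradients), so that only the summed gradient, never an individual party's gradient, is revealed.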