Biblio

Filters: Keyword is local differential privacy
2023-06-30
Shejy, Geocey, Chavan, Pallavi.  2022.  Sensitivity Support in Data Privacy Algorithms. 2022 2nd Asian Conference on Innovation in Technology (ASIANCON). :1–4.
Personal data privacy is a great concern for governments across the world, as citizens continuously generate huge amounts of data and industries use this data to improve user-centric services. There must be a reasonable balance between data privacy and data utility. Differential privacy is a promise by the data collector to the customer regarding personal privacy. Centralised Differential Privacy (CDP) perturbs the output of queries over users' data by applying the required privacy budget. It promises that the inclusion or exclusion of an individual's data in the data set will not significantly change the output of a statistical query, offering an ε-differential privacy guarantee. CDP relies on a trusted data collector and applies the global sensitivity of the data. Local Differential Privacy (LDP) lets each user perturb his data locally, thereby guaranteeing privacy even with an untrusted data collector. Differential privacy algorithms handle parameters such as privacy budget, sensitivity and data utility in different ways, mostly trying to keep a trade-off between privacy and data utility. This paper evaluates differential privacy algorithms with regard to the privacy support they offer according to the sensitivity of the data. Generalized application of the privacy budget is found to be ineffective in comparison to sensitivity-based usage of the privacy budget.
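To make the sensitivity discussion concrete, here is a minimal sketch of the Laplace mechanism in the centralized setting, where the noise scale grows with the query's sensitivity and shrinks with the privacy budget ε. This is an illustration of the general CDP principle only, not the paper's algorithm; function names are mine.

```python
import random

def laplace_noise(scale: float) -> float:
    # The difference of two independent Exp(1) draws is Laplace(0, 1).
    return scale * (random.expovariate(1.0) - random.expovariate(1.0))

def private_count(values, predicate, epsilon: float, sensitivity: float = 1.0):
    """epsilon-DP counting query: one record changes the count by at most
    `sensitivity`, so the noise is calibrated to sensitivity / epsilon."""
    true_count = sum(1 for v in values if predicate(v))
    return true_count + laplace_noise(sensitivity / epsilon)
```

With a generous budget (large ε) the noise is tiny; shrinking ε inflates the noise scale in proportion to the query's sensitivity, which is exactly the trade-off the paper examines.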
2022-04-26
Qin, Desong, Zhang, Zhenjiang.  2021.  A Frequency Estimation Algorithm under Local Differential Privacy. 2021 15th International Conference on Ubiquitous Information Management and Communication (IMCOM). :1–5.

With the rapid development of 5G, the Internet of Things (IoT) and edge computing technologies dramatically improve the efficiency of smart industries such as healthcare, smart agriculture and smart cities. IoT is a data-driven system in which many smart devices generate and collect massive amounts of private user data, which can be used to improve services. However, these data tend to leak personal privacy when sent to the Internet. Differential privacy (DP) provides both a method for measuring privacy protection and a more flexible privacy protection algorithm. In this paper, we study a frequency estimation problem and propose a new algorithm named MFEA that redesigns the publishing process. The algorithm maps a finite data set to an integer range through a hash function, initializes the data vector according to the mapped value, and adds noise through randomized response. The frequency of all perturbed data is then estimated with maximum likelihood. Compared with current traditional frequency estimation approaches, ours achieves better algorithmic complexity and error control while satisfying local differential privacy (LDP).
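The pipeline described in the abstract (encode, perturb with randomized response, debias) can be sketched with plain k-ary randomized response. The hash step and the maximum-likelihood refinement that distinguish MFEA are omitted, so this is a simplified illustration rather than the paper's algorithm:

```python
import math
import random
from collections import Counter

def krr_perturb(value, domain, epsilon: float):
    """k-ary randomized response: keep the true value with probability p,
    otherwise report a uniformly random other value."""
    k = len(domain)
    p = math.exp(epsilon) / (math.exp(epsilon) + k - 1)
    if random.random() < p:
        return value
    return random.choice([v for v in domain if v != value])

def krr_estimate(reports, domain, epsilon: float):
    """Debias the observed counts into unbiased frequency estimates."""
    k, n = len(domain), len(reports)
    p = math.exp(epsilon) / (math.exp(epsilon) + k - 1)
    q = 1.0 / (math.exp(epsilon) + k - 1)
    counts = Counter(reports)  # missing values count as 0
    return {v: (counts[v] / n - q) / (p - q) for v in domain}
```

Because each user's report alone reveals little, the collector only recovers accurate frequencies in aggregate; the debiased estimates sum to one by construction.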

2021-12-02
Gai, Na, Xue, Kaiping, He, Peixuan, Zhu, Bin, Liu, Jianqing, He, Debiao.  2020.  An Efficient Data Aggregation Scheme with Local Differential Privacy in Smart Grid. 2020 16th International Conference on Mobility, Sensing and Networking (MSN). :73–80.
Smart grid achieves reliable, efficient and flexible grid data processing by integrating the traditional power grid with information and communication technology. The control center can evaluate the supply and demand of the power grid through users' aggregated data, and then dynamically adjust the power supply, the price of power, etc. However, since the grid data collected from users may disclose their electricity usage habits and daily activities, privacy has become a critical concern. Most of the existing privacy-preserving data collection schemes for smart grid adopt homomorphic encryption or randomization techniques, which are either impractical because of high computation overhead or unrealistic because they require a trusted third party. In this paper, we propose a privacy-preserving smart grid data aggregation scheme satisfying local differential privacy (LDP) based on randomized response. Our scheme achieves efficient and practical estimation of the statistics of power supply and demand while preserving every individual participant's privacy. The performance analysis shows that our scheme is efficient in terms of computation and communication overhead.
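As a toy illustration of randomized-response aggregation (not the paper's actual scheme), suppose each household reports a single bit, say "consumption above a threshold", and the control center debiases the noisy sum; the threshold framing and function names are my assumptions:

```python
import math
import random

def rr_bit(bit: int, epsilon: float) -> int:
    """Report the true bit with probability p = e^eps / (e^eps + 1)."""
    p = math.exp(epsilon) / (math.exp(epsilon) + 1)
    return bit if random.random() < p else 1 - bit

def estimate_total(noisy_bits, epsilon: float) -> float:
    """Unbiased estimate of how many users' true bit was 1:
    E[sum] = n*(1-p) + true_ones*(2p-1), solved for true_ones."""
    n = len(noisy_bits)
    p = math.exp(epsilon) / (math.exp(epsilon) + 1)
    return (sum(noisy_bits) - n * (1 - p)) / (2 * p - 1)
```

No individual bit can be trusted, yet the aggregate demand estimate concentrates around the truth as the number of participants grows.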
2020-10-16
Liu, Liping, Piao, Chunhui, Jiang, Xuehong, Zheng, Lijuan.  2018.  Research on Governmental Data Sharing Based on Local Differential Privacy Approach. 2018 IEEE 15th International Conference on e-Business Engineering (ICEBE). :39–45.

With the construction and implementation of government information resource sharing mechanisms, the protection of citizens' privacy has become a vital issue for government departments and the public. This paper discusses the risk of disclosing citizens' private information through data sharing among government departments, and analyzes the current major privacy protection models for data sharing. To address the low efficiency and low reliability of existing e-government applications, a statistical data sharing framework among governmental departments based on local differential privacy and blockchain is established, and its applicability and advantages are illustrated through an example analysis. The characteristics of the private blockchain enhance the security, credibility and responsiveness of information sharing between departments. Local differential privacy provides better usability and security for sharing statistics: it keeps the statistics available while protecting the privacy of citizens.

2020-10-05
Joseph, Matthew, Mao, Jieming, Neel, Seth, Roth, Aaron.  2019.  The Role of Interactivity in Local Differential Privacy. 2019 IEEE 60th Annual Symposium on Foundations of Computer Science (FOCS). :94–105.

We study the power of interactivity in local differential privacy. First, we focus on the difference between fully interactive and sequentially interactive protocols. Sequentially interactive protocols may query users adaptively in sequence, but they cannot return to previously queried users. The vast majority of existing lower bounds for local differential privacy apply only to sequentially interactive protocols, and before this paper it was not known whether fully interactive protocols were more powerful. We resolve this question. First, we classify locally private protocols by their compositionality, the multiplicative factor by which the sum of a protocol's single-round privacy parameters exceeds its overall privacy guarantee. We then show how to efficiently transform any fully interactive compositional protocol into an equivalent sequentially interactive protocol with a blowup in sample complexity linear in this compositionality. Next, we show that our reduction is tight by exhibiting a family of problems such that any sequentially interactive protocol requires this blowup in sample complexity over a fully interactive compositional protocol. We then turn our attention to hypothesis testing problems. We show that for a large class of compound hypothesis testing problems - which include all simple hypothesis testing problems as a special case - a simple noninteractive test is optimal among the class of all (possibly fully interactive) tests.
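Paraphrasing the abstract's definition of compositionality in symbols (the notation here is mine, not the paper's): a protocol that runs $T$ adaptive rounds with per-round privacy parameters $\varepsilon_1,\dots,\varepsilon_T$ and overall guarantee $\varepsilon$ has compositionality

```latex
k \;=\; \frac{\sum_{t=1}^{T} \varepsilon_t}{\varepsilon},
```

and the reduction described above turns a fully interactive $k$-compositional protocol into a sequentially interactive one at the cost of a sample-complexity blowup linear in $k$.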

2020-01-06
Cormode, Graham, Jha, Somesh, Kulkarni, Tejas, Li, Ninghui, Srivastava, Divesh, Wang, Tianhao.  2018.  Privacy at Scale: Local Differential Privacy in Practice. Proceedings of the 2018 International Conference on Management of Data. :1655–1658.
Local differential privacy (LDP), where users randomly perturb their inputs to provide plausible deniability of their data without the need for a trusted party, has been adopted recently by several major technology organizations, including Google, Apple and Microsoft. This tutorial aims to introduce the key technical underpinnings of these deployed systems, to survey current research that addresses related problems within the LDP model, and to identify relevant open problems and research directions for the community.
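The plausible-deniability idea behind these deployments is easiest to see in Warner's classic two-coin randomized response, sketched below. This is an illustrative toy, not any vendor's actual mechanism (deployed systems such as Google's, Apple's and Microsoft's use more elaborate encodings):

```python
import random

def warner_response(truth: bool) -> bool:
    """Two-coin randomized response (Warner, 1965)."""
    if random.random() < 0.5:       # first coin: answer honestly
        return truth
    return random.random() < 0.5    # second coin: answer uniformly at random

def estimate_fraction(responses) -> float:
    """Debias: the observed "yes" rate y satisfies y = 0.25 + 0.5 * t,
    where t is the true fraction, so t = 2y - 0.5."""
    y = sum(responses) / len(responses)
    return max(0.0, min(1.0, 2 * y - 0.5))
```

Any single "yes" is deniable (it may have come from the second coin), yet the population fraction remains estimable; here the privacy level works out to epsilon = ln 3.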
2018-09-28
Lu, Z., Shen, H..  2017.  A New Lower Bound of Privacy Budget for Distributed Differential Privacy. 2017 18th International Conference on Parallel and Distributed Computing, Applications and Technologies (PDCAT). :25–32.

Distributed data aggregation via summation (counting) helps us learn the insights behind raw data. However, such computation suffers from a high privacy risk of malicious collusion attacks: colluding adversaries infer a victim's private data from the gaps between the aggregation outputs and their own source data. Among the solutions against such collusion attacks, Distributed Differential Privacy (DDP) shows a significant privacy-preservation effect. Specifically, a DDP scheme guarantees global differential privacy (the presence or absence of any data curator barely impacts the aggregation outputs) by ensuring local differential privacy at each data curator. To guarantee the overall privacy of a distributed data aggregation system against malicious collusion attacks, part of the existing work on DDP schemes aims to provide an estimated lower bound of the privacy budget for global differential privacy. However, there are two main problems: low data utility from using a large global function sensitivity, and an unknown privacy guarantee when the aggregation sensitivity of the whole system is less than the sum of the data curators' aggregation sensitivities. To address these problems while ensuring distributed differential privacy, we provide a new lower bound of the privacy budget, which works with an unconditional aggregation sensitivity of the whole distributed system. Moreover, we study the performance of our privacy bound in different scenarios of data updates. Both theoretical and experimental evaluations show that our privacy bound offers better global privacy performance than the existing work.

2017-05-22
Qin, Zhan, Yang, Yin, Yu, Ting, Khalil, Issa, Xiao, Xiaokui, Ren, Kui.  2016.  Heavy Hitter Estimation over Set-Valued Data with Local Differential Privacy. Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security. :192–203.

In local differential privacy (LDP), each user perturbs her data locally before sending the noisy data to a data collector. The latter then analyzes the data to obtain useful statistics. Unlike the centralized differential privacy setting, in LDP the data collector never gains access to the exact values of sensitive data, which protects not only the privacy of data contributors but also the collector itself against the risk of potential data leakage. Existing LDP solutions in the literature are mostly limited to the case where each user possesses a tuple of numeric or categorical values, and the data collector computes basic statistics such as counts or mean values. To the best of our knowledge, no existing work tackles more complex data mining tasks such as heavy hitter discovery over set-valued data. In this paper, we present a systematic study of heavy hitter mining under LDP. We first review existing solutions, extend them to heavy hitter estimation, and explain why their effectiveness is limited. We then propose LDPMiner, a two-phase mechanism for obtaining accurate heavy hitters with LDP. The main idea is to first gather a candidate set of heavy hitters using a portion of the privacy budget, and to focus the remaining budget on refining the candidate set in a second phase, which is much more budget-efficient than obtaining the heavy hitters directly from the whole dataset. We provide both in-depth theoretical analysis and extensive experiments comparing LDPMiner against adaptations of previous solutions. The results show that LDPMiner significantly improves over existing methods. More importantly, LDPMiner successfully identifies the majority of true heavy hitters in practical settings.
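The two-phase budget idea can be sketched as follows. This is a simplified illustration built on k-ary randomized response, with disjoint user groups per phase and a catch-all `_other` bucket, all of which are my assumptions rather than the actual LDPMiner protocol:

```python
import math
import random
from collections import Counter

def krr(value, domain, eps: float):
    """k-ary randomized response over `domain`."""
    p = math.exp(eps) / (math.exp(eps) + len(domain) - 1)
    if random.random() < p:
        return value
    return random.choice([v for v in domain if v != value])

def debias(reports, domain, eps: float):
    """Unbiased frequency estimates from randomized-response reports."""
    k, n = len(domain), len(reports)
    p = math.exp(eps) / (math.exp(eps) + k - 1)
    q = 1.0 / (math.exp(eps) + k - 1)
    c = Counter(reports)
    return {v: (c[v] / n - q) / (p - q) for v in domain}

def two_phase_heavy_hitters(data, domain, eps_total, top, alpha=0.5):
    """Phase 1 spends alpha * eps_total to shortlist candidates over the
    full domain; phase 2 spends the rest on the much smaller candidate
    set, where randomized response is far more accurate."""
    data = list(data)
    random.shuffle(data)            # randomly assign users to the phases
    eps1, eps2 = alpha * eps_total, (1 - alpha) * eps_total
    half = len(data) // 2           # disjoint user groups, one per phase
    est1 = debias([krr(v, domain, eps1) for v in data[:half]], domain, eps1)
    candidates = sorted(est1, key=est1.get, reverse=True)[:top]
    reduced = candidates + ["_other"]
    mapped = [v if v in candidates else "_other" for v in data[half:]]
    est2 = debias([krr(v, reduced, eps2) for v in mapped], reduced, eps2)
    return {v: est2[v] for v in candidates}
```

The payoff is in phase 2: perturbing over a domain of `top + 1` items instead of the full domain yields much tighter estimates for the shortlisted heavy hitters at the same budget.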

Krishnan, Sanjay, Wang, Jiannan, Franklin, Michael J., Goldberg, Ken, Kraska, Tim.  2016.  PrivateClean: Data Cleaning and Differential Privacy. Proceedings of the 2016 International Conference on Management of Data. :937–951.

Recent advances in differential privacy make it possible to guarantee user privacy while preserving the main characteristics of the data. However, most differential privacy mechanisms assume that the underlying dataset is clean. This paper explores the link between data cleaning and differential privacy in a framework we call PrivateClean. PrivateClean includes a technique for creating private datasets of numerical and discrete-valued attributes, a formalism for privacy-preserving data cleaning, and techniques for answering sum, count, and avg queries after cleaning. We show: (1) how the degree of privacy affects subsequent aggregate query accuracy, (2) how privacy potentially amplifies certain types of errors in a dataset, and (3) how this analysis can be used to tune the degree of privacy. The key insight is to maintain a bipartite graph relating dirty values to clean values and use this graph to estimate biases due to the interaction between cleaning and privacy. We validate these results on four datasets with a variety of well-studied cleaning techniques including using functional dependencies, outlier filtering, and resolving inconsistent attributes.
