Bibliography
With the development of 5G technology and intelligent terminals, Pervasive Edge Computing (PEC) is the future direction of the Industrial Internet of Things (IIoT). In a pervasive edge computing environment, intelligent terminals can perform computation and data processing locally. By migrating part of the computation of the original cloud computing model to intelligent terminals, a terminal can complete model training without uploading its local data to a remote server. Pervasive edge computing thus alleviates the data-island problem and has been applied successfully in scenarios such as vehicle interconnection and video surveillance. However, it also faces serious security problems: even if the remote server is honest but curious, it can still design algorithms for the intelligent terminal to execute and infer sensitive content, such as identity data and private pictures, from the information the terminal returns. In this paper, we study the problem of honest-but-curious remote servers infringing on intelligent-terminal privacy and propose a differentially private collaborative deep learning algorithm for the pervasive edge computing environment. We use a Gaussian mechanism that satisfies the differential privacy guarantee to add noise to the first layer of the neural network to protect the data of the intelligent terminal, and we use the analytical moments accountant to track the cumulative privacy loss. Experiments show that with the Gaussian mechanism, the training data of intelligent terminals can be protected with only a small reduction in accuracy.
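A minimal sketch of the kind of first-layer noise injection described above, assuming per-example clipping and a noise multiplier chosen via the moments accountant; the layer sizes, clip_norm, and sigma values are illustrative, not the paper's settings.

```python
# Hedged sketch: Gaussian-mechanism noise added to the first layer of a small network.
import numpy as np

def first_layer_with_gaussian_dp(x, W, b, clip_norm=1.0, sigma=1.1, rng=None):
    """Forward pass of the first layer with per-example clipping and Gaussian noise.

    clip_norm bounds the L2 sensitivity of the layer output; sigma is the noise
    multiplier that would be chosen to meet a target (epsilon, delta).
    """
    rng = rng or np.random.default_rng(0)
    h = x @ W + b                                            # first-layer pre-activation
    norms = np.linalg.norm(h, axis=1, keepdims=True)
    h = h * np.minimum(1.0, clip_norm / (norms + 1e-12))     # clip each example to clip_norm
    h += rng.normal(0.0, sigma * clip_norm, size=h.shape)    # Gaussian mechanism
    return np.maximum(h, 0.0)                                # ReLU

# toy usage with random data
x = np.random.default_rng(1).normal(size=(4, 8))
W = np.random.default_rng(2).normal(size=(8, 16)) * 0.1
print(first_layer_with_gaussian_dp(x, W, np.zeros(16)).shape)
```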
Recently, federated learning (FL) has been applied as an advanced and practical solution to privacy-preserving issues in distributed multi-party federated modeling. However, most existing FL methods assume the same privacy budget for all participants and ignore their varying privacy requirements. In this paper, we propose, for the first time, an algorithm (PLU-FedOA) to optimize the deep neural network of horizontal FL under personalized local differential privacy. To this end, we design two approaches: PLU, which allows clients to upload local updates under differential privacy at a personally selected privacy level, and FedOA, which helps the server aggregate local parameters with optimized weights in mixed privacy-preserving scenarios. Moreover, we theoretically analyze the privacy and optimization effects of our approaches. Finally, we verify PLU-FedOA on real-world datasets.
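The abstract does not spell out FedOA's aggregation rule; the following is only our hedged illustration of one plausible choice, weighting each client's noisy update by the inverse of its noise variance. The function name, weighting rule, and values are hypothetical, not the paper's exact scheme.

```python
# Illustrative sketch of weighted aggregation under mixed privacy levels.
import numpy as np

def aggregate_mixed_privacy(updates, sigmas):
    """updates: list of parameter vectors perturbed with client-chosen noise levels.
    sigmas: per-client Gaussian noise std; smaller sigma (weaker privacy) gets more weight."""
    w = np.array([1.0 / (s ** 2) for s in sigmas])
    w = w / w.sum()                                   # normalize the weights
    return sum(wi * ui for wi, ui in zip(w, np.asarray(updates)))

updates = [np.array([1.0, 2.0]), np.array([1.2, 1.8]), np.array([0.9, 2.2])]
print(aggregate_mixed_privacy(updates, sigmas=[0.5, 1.0, 2.0]))
```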
Keystroke dynamics is a behavioural biometric form of authentication based on the inherent typing behaviour of an individual. While this technique is gaining traction, protecting the privacy of the users is of utmost importance. Fully Homomorphic Encryption (FHE) is a technique that allows computation on encrypted data, which enables processing of sensitive data in an untrusted environment. FHE is also considered "future-proof", since it is a lattice-based cryptosystem regarded as quantum-safe, and it has seen significant performance improvements over the years along with increasingly developer-friendly tooling. We propose a neural network for keystroke analysis that is trained with differential privacy, which keeps training fast while preserving privacy, and that predicts on FHE-encrypted data to keep users' privacy intact while offering sufficient usability.
Federated learning (FL) allows training on massive amounts of data privately owing to its decentralized structure. Stochastic gradient descent (SGD) is commonly used for FL because of its good empirical performance, but sensitive user information can still be inferred from the weight updates shared during FL iterations. We consider Gaussian mechanisms to preserve local differential privacy (LDP) of user data in the FL model with SGD. Trade-offs between user privacy, global utility, and transmission rate are proved by defining appropriate metrics for FL with LDP. Compared to existing results, the query sensitivity used in LDP is treated as a variable, and a tighter privacy accounting method is applied. The proposed utility bound allows heterogeneous parameters over all users. Our bounds characterize how much utility decreases and transmission rate increases when a stronger privacy regime is targeted. Furthermore, given a target privacy level, our results guarantee significantly larger utility and a smaller transmission rate than existing privacy accounting methods.
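For reference, the classical calibration of the Gaussian mechanism, which the paper tightens by treating sensitivity as a variable and using a sharper accountant, adds noise $\mathcal{N}(0, \sigma^2 I)$ to a clipped weight update with $L_2$-sensitivity $\Delta_2$; for $(\varepsilon, \delta)$-DP with $\varepsilon \in (0,1)$ it suffices that

\[
\sigma \;\ge\; \frac{\Delta_2 \sqrt{2 \ln(1.25/\delta)}}{\varepsilon}.
\]

This is the standard Dwork-Roth bound, given only to anchor the sensitivity/noise trade-off the abstract refers to.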
With the development of networks, network security has become a topic of increasing concern. In recent years, machine learning has become an effective means of network intrusion detection. However, machine learning requires a large amount of data for training, and training data often contains private information, which brings a great risk of privacy leakage. At present, there is little research on data privacy protection in the field of intrusion detection. To address this privacy and security issue, we combine differential privacy with machine learning algorithms, including the One-class Support Vector Machine (OCSVM) and Local Outlier Factor (LOF), to propose a hybrid intrusion detection system (IDS) with privacy protection. We add Laplace noise to the original network intrusion detection dataset to obtain differentially private datasets under different privacy budgets, and propose a hybrid machine-learning-based IDS model to verify their utility. Experiments show that while protecting data privacy, the hybrid IDS achieves detection accuracy comparable to traditional machine learning algorithms.
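A small sketch of the perturbation step as we understand it, applying the Laplace mechanism to numeric features for several privacy budgets; the feature matrix and sensitivity below are placeholders, not the paper's intrusion-detection dataset.

```python
# Hedged sketch: Laplace noise on dataset features under different privacy budgets.
import numpy as np

def laplace_perturb(X, epsilon, sensitivity=1.0, rng=None):
    """Add Laplace(0, sensitivity/epsilon) noise to every numeric feature."""
    rng = rng or np.random.default_rng(0)
    return X + rng.laplace(0.0, sensitivity / epsilon, size=X.shape)

X = np.random.default_rng(1).uniform(0, 1, size=(5, 3))    # stand-in for normalized IDS features
for eps in (0.1, 1.0, 10.0):                                # smaller epsilon -> stronger privacy, more noise
    print(eps, np.abs(laplace_perturb(X, eps) - X).mean())
```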
Pedestrian dead reckoning (PDR) is a widely used approach to estimate locations and trajectories. Accessing location-based services with trajectory data can bring convenience to people, but may also raise privacy concerns that need to be addressed. In this paper, a privacy-preserving pedestrian dead reckoning framework is proposed to protect a user’s trajectory privacy based on differential privacy. We introduce two metrics to quantify trajectory privacy and data utility. Our proposed privacy-preserving trajectory extraction algorithm consists of three mechanisms for the initial locations, stride lengths and directions. In addition, we design an adversary model based on particle filtering to evaluate the performance and demonstrate the effectiveness of our proposed framework with our collected sensor reading dataset.
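An illustrative simplification of the idea (our own, not the paper's three mechanisms): Laplace noise is added to each step's stride length and heading before the trajectory is reconstructed by dead reckoning. The epsilon values, sensitivities, and step data are placeholders.

```python
# Hedged sketch: per-step noise on stride length and heading, then dead reckoning.
import numpy as np

def private_pdr(start, strides, headings, eps_len=1.0, eps_dir=1.0, rng=None):
    rng = rng or np.random.default_rng(0)
    strides = strides + rng.laplace(0.0, 1.0 / eps_len, size=strides.shape)
    headings = headings + rng.laplace(0.0, 1.0 / eps_dir, size=headings.shape)
    x, y = start
    traj = [(x, y)]
    for s, h in zip(strides, headings):        # accumulate noisy steps into a trajectory
        x, y = x + s * np.cos(h), y + s * np.sin(h)
        traj.append((x, y))
    return np.array(traj)

print(private_pdr((0.0, 0.0), np.full(5, 0.7), np.linspace(0, np.pi / 4, 5)).round(2))
```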
Differential privacy is a concept to quantify the disclosure of private information, controlled by the privacy parameter ε. However, an intuitive interpretation of ε is needed to explain the privacy loss to data engineers and data subjects. In this paper, we conduct a worst-case study of differential privacy risks. We generalize an existing model and reduce its complexity to provide more understandable statements on the privacy loss. To this end, we analyze the impact of the parameters and introduce the notions of a global privacy risk and a global privacy leak.
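One standard worst-case statement of this kind, given here only to illustrate the sort of bound the paper formalizes: for an adversary with a uniform prior over two neighbouring datasets $D$ and $D'$, observing the output of an $\varepsilon$-differentially private mechanism raises the posterior probability of identifying the true dataset to at most

\[
\Pr[D \mid \text{output}] \;\le\; \frac{e^{\varepsilon}}{1 + e^{\varepsilon}}.
\]

For example, $\varepsilon = 1$ caps this posterior at roughly 0.73, which is the kind of concrete statement that makes ε easier to explain to data subjects.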
Current implementations of Differential Privacy (DP) focus primarily on the privacy of the data release. The planned thesis will investigate steps towards a user-centric approach of DP in the scope of the Internet-of-Things (IoT) which focuses on data subjects, IoT developers, and data analysts. We will conduct user studies to find out more about the often conflicting interests of the involved parties and the encountered challenges. Furthermore, a technical solution will be developed to assist data subjects and analysts in making better informed decisions. As a result, we expect our contributions to be a step towards the development of usable DP for IoT sensor data.
With the rapid development of 5G, the Internet of Things (IoT) and edge computing technologies dramatically improve the efficiency of smart industries such as healthcare, smart agriculture, and smart cities. IoT is a data-driven system in which many smart devices generate and collect massive amounts of private user data, which can be used to improve services for users. However, these data tend to leak personal privacy when sent over the Internet. Differential privacy (DP) provides a way to measure privacy protection as well as more flexible privacy-protection algorithms. In this paper, we study a frequency estimation problem and propose a new frequency estimation algorithm, MFEA, which redesigns the publishing process. The algorithm maps a finite data set to an integer range through a hash function, initializes a data vector according to the mapped value, and adds noise through randomized response; the frequencies of all perturbed data are then estimated with maximum likelihood. Compared with traditional frequency estimation, our approach achieves better algorithmic complexity and error control while satisfying local differential privacy (LDP).
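A simplified sketch of the hash-then-randomized-response pipeline described above; the bit-level perturbation and debiasing step below are a standard construction we substitute for MFEA's exact perturbation and maximum-likelihood estimator, and the bucket count, budget, and data are illustrative.

```python
# Hedged sketch: hash items to buckets, perturb the one-hot vector, debias counts.
import numpy as np
import zlib

def perturb(item, d, eps, rng):
    """Map an item to one of d integer buckets, then flip each bit of the one-hot
    vector with probability 1/(1+e^eps) (symmetric randomized response)."""
    v = np.zeros(d, dtype=int)
    v[zlib.crc32(item.encode()) % d] = 1
    p = np.exp(eps) / (np.exp(eps) + 1.0)      # probability of reporting a bit truthfully
    flip = rng.random(d) > p
    return np.where(flip, 1 - v, v)

def estimate(reports, eps):
    """Debias the aggregated bit counts into per-bucket frequency estimates."""
    reports = np.asarray(reports)
    n, p = len(reports), np.exp(eps) / (np.exp(eps) + 1.0)
    return (reports.sum(axis=0) - n * (1.0 - p)) / (2.0 * p - 1.0)

rng = np.random.default_rng(0)
data = ["a"] * 600 + ["b"] * 300 + ["c"] * 100
reports = [perturb(x, d=8, eps=2.0, rng=rng) for x in data]
print(np.round(estimate(reports, eps=2.0)))    # counts near 600/300/100 in the hashed buckets
```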
Smart meter devices enable a better understanding of demand at the potential risk of private information leakage. One promising solution to mitigating such risk is to inject noise into the meter data to achieve a certain level of differential privacy. In this paper, we cast one-shot non-intrusive load monitoring (NILM) in the compressive sensing framework and bridge the gap between the theoretical accuracy of NILM inference and the differential privacy parameters. We then derive valid theoretical bounds to offer insights on how the differential privacy parameters affect NILM performance. Moreover, we generalize our conclusions by proposing a hierarchical framework to solve the multi-shot NILM problem. Numerical experiments verify our analytical results and offer better physical insight into differential privacy in various practical scenarios. This also demonstrates the significance of our work for general privacy-preserving mechanism design.
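Our reading of the one-shot setup, in hypothetical notation rather than the paper's own: the aggregate reading $y$ is modeled as $y = A x$ with a sparse appliance-state vector $x$, the published measurement is perturbed as $\tilde{y} = y + \mathrm{Lap}(\Delta/\varepsilon)$, and recovery solves

\[
\min_{x} \|x\|_1 \quad \text{s.t.} \quad \|A x - \tilde{y}\|_2 \le \eta,
\]

where the slack $\eta$ must absorb the injected noise; this is the channel through which the privacy parameters enter the NILM accuracy bound.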
Noise has been used to protect the privacy of users in public datasets for many decades. Differential privacy is a newer standard for adding noise so that user privacy is protected. When this technique is applied to a single end user's data, it is called local differential privacy. In this study, we evaluate the effects of adding noise to generate randomized responses on machine learning models. We generate randomized responses using Gaussian and Laplacian noise on individual end-user data as well as correlated end-user data. Finally, we provide the results we have observed on a few datasets for various machine learning use cases.
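A toy sketch of the two noise options on a single user's record before it is used for training; the feature values, sensitivity, and the use of the classical Gaussian calibration are our illustrative choices, not the study's exact setup.

```python
# Hedged sketch: Laplacian vs. Gaussian perturbation of one end user's record.
import numpy as np

def randomize(record, mechanism="laplace", epsilon=1.0, delta=1e-5, sensitivity=1.0, rng=None):
    rng = rng or np.random.default_rng(0)
    if mechanism == "laplace":
        return record + rng.laplace(0.0, sensitivity / epsilon, size=record.shape)
    # classical Gaussian calibration for (epsilon, delta)-DP
    sigma = sensitivity * np.sqrt(2.0 * np.log(1.25 / delta)) / epsilon
    return record + rng.normal(0.0, sigma, size=record.shape)

user = np.array([0.2, 5.0, 1.0])               # hypothetical end-user features
print(randomize(user, "laplace"), randomize(user, "gaussian"))
```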