Biblio

Filters: Keyword is data utility
2022-06-14
Yasa, Ray Novita, Buana, I Komang Setia, Girinoto, Setiawan, Hermawan, Hadiprakoso, Raden Budiarto.  2021.  Modified RNP Privacy Protection Data Mining Method as Big Data Security. 2021 International Conference on Informatics, Multimedia, Cyber and Information System (ICIMCIS). :30–34.
Privacy-Preserving Data Mining (PPDM) has become an exciting topic in recent decades due to the growing interest in big data and data mining. It is a technique for securing data while still preserving the privacy contained in it. This paper provides an alternative perturbation-based PPDM technique, carried out by modifying the RNP algorithm. The novelty of this paper lies in modifying certain steps of the method with a specific purpose. The modifications are, first, narrowing the selection of the disturbance value, so that the number of attributes replaced in each record is exactly the number of attributes in the original data, no more and with no need for repetition; second, deriving the perturbation function from the cumulative distribution function and using it to find the probability distribution function, so that the selection of replacement data has a clear basis. Experimental results on twenty-five perturbed datasets show that the modified RNP algorithm balances data utility and security level when the appropriate disturbance and perturbation values are selected. The level of security is measured using privacy metrics in the form of value difference, average transformation of data, and percentage of retained values. The method presented in this paper is attractive for application to real data that requires privacy preservation.
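
As a rough illustration of the perturbation idea described above (not the authors' code), the sketch below replaces each value in a numeric column with a draw obtained by inverting the column's empirical cumulative distribution function, so every replacement has an explicit probabilistic basis; the column name and data are hypothetical:

```python
# Illustrative sketch only: inverse-transform sampling of replacement values
# from a column's empirical CDF. Names and structure are assumptions.
import numpy as np

def perturb_column(values, rng=None):
    """Replace each value with a draw from the column's empirical distribution."""
    rng = rng or np.random.default_rng()
    sorted_vals = np.sort(values)
    n = len(sorted_vals)
    # Empirical CDF: F(sorted_vals[i]) = (i + 1) / n
    u = rng.uniform(0, 1, size=n)                       # one uniform draw per record
    idx = np.searchsorted(np.arange(1, n + 1) / n, u)   # invert the CDF
    return sorted_vals[np.clip(idx, 0, n - 1)]

ages = np.array([23, 35, 31, 52, 47, 29, 61, 38])
print(perturb_column(ages))   # perturbed ages with the same marginal distribution

```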
2021-01-28
Esmeel, T. K., Hasan, M. M., Kabir, M. N., Firdaus, A..  2020.  Balancing Data Utility versus Information Loss in Data-Privacy Protection using k-Anonymity. 2020 IEEE 8th Conference on Systems, Process and Control (ICSPC). :158–161.

Data privacy has been an important area of research in recent years. Datasets often contain sensitive data fields, exposure of which may jeopardize the interests of the individuals associated with the data. To resolve this issue, privacy techniques can be used to hinder the identification of a person through anonymization of the sensitive data in the dataset, while the anonymized dataset can still be used by third parties for analysis without obstruction. In this research, we investigated the k-anonymity privacy technique for different values of k on different numbers of columns of the dataset. Next, the information loss due to k-anonymity is computed. The anonymized files then go through a classification process using several machine-learning algorithms, i.e., Naive Bayes, J48, and a neural network, to check the balance between data anonymity and data utility. Based on the classification accuracy, the optimal value of k and the optimal number of columns to anonymize are obtained, and these optimal values can then be used by the k-anonymity algorithm.
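
The property being checked is simple enough to sketch. The following minimal example, assuming pandas and hypothetical column names, tests whether a generalized table satisfies k-anonymity over its quasi-identifiers:

```python
# Minimal sketch, assuming pandas; checks k-anonymity of a generalized table.
# Column names and data are hypothetical.
import pandas as pd

def is_k_anonymous(df, quasi_ids, k):
    """Every combination of quasi-identifier values must occur at least k times."""
    return df.groupby(quasi_ids).size().min() >= k

df = pd.DataFrame({
    "age_range": ["20-30", "20-30", "30-40", "30-40"],   # generalized quasi-identifier
    "zip_prefix": ["481**", "481**", "482**", "482**"],
    "diagnosis": ["flu", "cold", "flu", "asthma"],       # sensitive attribute
})
print(is_k_anonymous(df, ["age_range", "zip_prefix"], k=2))   # True
```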

Wang, N., Song, H., Luo, T., Sun, J., Li, J..  2020.  Enhanced p-Sensitive k-Anonymity Models for Achieving Better Privacy. 2020 IEEE/CIC International Conference on Communications in China (ICCC). :148–153.

To the best of our knowledge, the p-sensitive k-anonymity model is a sophisticated model for resisting linking attacks and homogeneity attacks in data publishing. However, if the distribution of sensitive values is skewed, the model has difficulty defending against skew attacks and may even face sensitivity attacks. In practice, the privacy requirements of different sensitive values are not always identical. A "one size fits all" unified privacy protection level may cause unnecessary information loss. To address these problems, the paper quantifies privacy requirements with the concept of IDF and pays closer attention to sensitive groups. Two enhanced anonymous models with personalized protection characteristics, the (p,αisg)-sensitive k-anonymity model and the (pi,αisg)-sensitive k-anonymity model, are then proposed to resist skew attacks and sensitivity attacks. Furthermore, two clustering algorithms, one with global search and one with local search, are designed to implement the models. Experimental results show that the two enhanced models achieve markedly better privacy at the expense of only a small amount of data utility.
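
A minimal sketch of the baseline property these models extend may help. The check below verifies plain p-sensitive k-anonymity (each equivalence class has at least k records and at least p distinct sensitive values); the per-group α thresholds of the enhanced models are not reproduced:

```python
# Hedged sketch of baseline p-sensitive k-anonymity; field names are hypothetical.
from collections import defaultdict

def is_p_sensitive_k_anonymous(records, quasi_ids, sensitive, k, p):
    classes = defaultdict(list)
    for rec in records:
        key = tuple(rec[q] for q in quasi_ids)   # equivalence class key
        classes[key].append(rec[sensitive])
    return all(len(vals) >= k and len(set(vals)) >= p
               for vals in classes.values())

records = [
    {"age": "30-40", "zip": "481**", "disease": "flu"},
    {"age": "30-40", "zip": "481**", "disease": "asthma"},
    {"age": "30-40", "zip": "481**", "disease": "flu"},
]
print(is_p_sensitive_k_anonymous(records, ["age", "zip"], "disease", k=3, p=2))  # True
```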

2020-12-28
Tojiboev, R., Lee, W., Lee, C. C..  2020.  Adding Noise Trajectory for Providing Privacy in Data Publishing by Vectorization. 2020 IEEE International Conference on Big Data and Smart Computing (BigComp). :432–434.

Since trajectory data is widely collected and utilized for scientific research and business purposes, publishing trajectories without a proper privacy policy poses an acute threat to individuals' data. Recently, several methods, i.e., k-anonymity, l-diversity, and t-closeness, have been studied, though each tends to protect privacy by reducing the data in a way that depends on the characteristics of that method. When strong privacy protection is required, these methods reduce data utility excessively, which may affect the results of scientific research. In this research, we suggest a novel approach to this dilemma: adding noise to trajectories in a vector-based grid environment.
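
A hedged sketch of the general idea, with an assumed grid size and sample coordinates rather than the authors' parameters: snap each trajectory point to a grid cell, then shift it by a bounded random number of cells:

```python
# Illustrative sketch, not the authors' code; grid size and noise bound are assumed.
import random

GRID = 0.01   # grid cell size in degrees (assumed)

def vectorize_and_noise(trajectory, noise_cells=1, seed=42):
    rng = random.Random(seed)
    noisy = []
    for lat, lon in trajectory:
        # snap to the grid
        glat = round(lat / GRID) * GRID
        glon = round(lon / GRID) * GRID
        # shift by up to noise_cells cells in each direction
        glat += rng.randint(-noise_cells, noise_cells) * GRID
        glon += rng.randint(-noise_cells, noise_cells) * GRID
        noisy.append((round(glat, 4), round(glon, 4)))
    return noisy

trip = [(37.5665, 126.9780), (37.5651, 126.9895)]   # sample points
print(vectorize_and_noise(trip))
```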

2020-07-13
Andrew, J., Karthikeyan, J., Jebastin, Jeffy.  2019.  Privacy Preserving Big Data Publication On Cloud Using Mondrian Anonymization Techniques and Deep Neural Networks. 2019 5th International Conference on Advanced Computing Communication Systems (ICACCS). :722–727.

In recent trends, privacy preservation is the predominant concern in big data analytics and cloud computing. Every organization collects personal data from users, actively or passively. Publishing this data for research and other analytics without removing Personally Identifiable Information (PII) will lead to privacy breaches. Existing anonymization techniques fail to maintain the balance between data privacy and data utility. To provide a trade-off between user privacy and data utility, a Mondrian-based k-anonymity approach is proposed. To protect the privacy of high-dimensional data, a Deep Neural Network (DNN) based framework is proposed. The experimental results show that the proposed approach mitigates the information loss of the data without compromising privacy.
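
For orientation, here is a compact sketch of the Mondrian-style median partitioning the approach builds on, simplified to numeric quasi-identifiers with hypothetical data; it greedily cuts at the median until any further cut would violate k:

```python
# Simplified Mondrian-style partitioning sketch; data and fields are hypothetical.
def mondrian(records, quasi_ids, k):
    """Return a list of partitions, each with at least k records."""
    # try the attribute with the most distinct values first
    for q in sorted(quasi_ids, key=lambda q: -len({r[q] for r in records})):
        vals = sorted(r[q] for r in records)
        median = vals[len(vals) // 2]
        left = [r for r in records if r[q] < median]
        right = [r for r in records if r[q] >= median]
        if len(left) >= k and len(right) >= k:
            return mondrian(left, quasi_ids, k) + mondrian(right, quasi_ids, k)
    return [records]   # no allowable cut: this is one equivalence class

data = [{"age": a, "zip": z} for a, z in
        [(25, 48104), (29, 48103), (34, 48201), (41, 48202), (44, 48301), (52, 48302)]]
for part in mondrian(data, ["age", "zip"], k=2):
    print(part)
```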

2019-11-04
Beigi, Ghazaleh, Shu, Kai, Zhang, Yanchao, Liu, Huan.  2018.  Securing Social Media User Data: An Adversarial Approach. Proceedings of the 29th on Hypertext and Social Media. :165–173.
Social media users generate tremendous amounts of data. To better serve users, user-related data must be shared among researchers, advertisers, and application developers, yet publishing such data raises concerns about user privacy. To encourage data sharing and mitigate these concerns, a number of anonymization and de-anonymization algorithms have been developed to help protect the privacy of social media users. In this work, we propose a new adversarial attack specialized for social media data. We further provide a principled way to assess the effectiveness of anonymizing different aspects of social media data. Our work sheds light on new privacy risks in social media data arising from the innate heterogeneity of user-generated data, which requires striking a balance between sharing user data and protecting user privacy.
2018-09-28
Jung, Taebo, Jung, Kangsoo, Park, Sehwa, Park, Seog.  2017.  A noise parameter configuration technique to mitigate detour inference attack on differential privacy. 2017 IEEE International Conference on Big Data and Smart Computing (BigComp). :186–192.

Nowadays, data has become more important as the core resource of the information society. However, with the development of data analysis techniques, privacy violations such as leakage of sensitive data and exposure of personal identities are also increasing. Differential privacy is a technique for satisfying the requirement that no additional information beyond the database itself should be disclosed, and it is well known for protecting privacy against arbitrary attacks. However, recent research argues that there are several ways to infer sensitive information from data even when differential privacy is applied. One such inference method uses the correlation between data. In this paper, we investigate new privacy threats based on attribute correlation that are not covered by traditional studies, and we propose a privacy-preserving technique that configures differential privacy's noise parameter to counter this new threat. In experiments, we show the weaknesses of the traditional differential privacy method and validate that the proposed noise parameter configuration method provides sufficient privacy protection while maintaining the accuracy of data utility.
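
The knob being configured is the noise scale of a standard differential-privacy mechanism. The sketch below uses the Laplace mechanism and a placeholder correlation factor to show the direction of the adjustment; the factor's actual derivation is the paper's contribution and is not reproduced here:

```python
# Hedged illustration: Laplace mechanism with a placeholder correlation factor.
import numpy as np

def laplace_mechanism(true_value, sensitivity, epsilon, correlation_factor=1.0):
    # larger correlation_factor -> larger scale -> stronger protection
    scale = correlation_factor * sensitivity / epsilon
    return true_value + np.random.laplace(0.0, scale)

count = 128   # true query answer
print(laplace_mechanism(count, sensitivity=1, epsilon=0.5))                           # standard DP
print(laplace_mechanism(count, sensitivity=1, epsilon=0.5, correlation_factor=2.0))   # correlation-aware
```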

Hu, J., Shi, W., Liu, H., Yan, J., Tian, Y., Wu, Z..  2017.  Preserving Friendly-Correlations in Uncertain Graphs Using Differential Privacy. 2017 International Conference on Networking and Network Applications (NaNA). :24–29.

It is a challenging problem to preserve the friendly correlations between individuals when publishing social-network data. To alleviate this problem, the uncertain graph has been proposed recently. The main idea of the uncertain graph is to convert an original graph into an uncertain form in which the correlations between individuals become associated probabilities. However, the existing uncertain-graph methods lack rigorous guarantees of privacy and rely on assumptions about the adversary's knowledge. In this paper, we first introduce a general model for constructing uncertain graphs. We then propose an algorithm under this model that is based on differential privacy, and we analyze the algorithm's privacy. Our algorithm provides rigorous privacy guarantees and defends against background-knowledge attacks; it satisfies differential privacy and shows feasibility in the experiments. Finally, we compare our algorithm with the (k, ε)-obfuscation algorithm in terms of data utility: the importance of nodes to the network under our algorithm is similar to that under (k, ε)-obfuscation.
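
The uncertain-graph idea itself is easy to sketch. Under an assumed Laplace perturbation (not the authors' exact algorithm), each potential edge is assigned an existence probability that is noised and clipped:

```python
# Sketch of the uncertain-graph idea under an assumed Laplace perturbation.
import itertools
import numpy as np

def to_uncertain_graph(nodes, edges, epsilon=1.0, seed=0):
    rng = np.random.default_rng(seed)
    edge_set = {tuple(sorted(e)) for e in edges}
    uncertain = {}
    for u, v in itertools.combinations(sorted(nodes), 2):
        p = 1.0 if (u, v) in edge_set else 0.0   # true edge indicator
        p += rng.laplace(0.0, 1.0 / epsilon)     # sensitivity 1 per edge
        uncertain[(u, v)] = float(np.clip(p, 0.0, 1.0))
    return uncertain

g = to_uncertain_graph(["a", "b", "c"], [("a", "b")], epsilon=2.0)
print(g)   # each pair now has an existence probability
```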

2018-09-05
Jia, R., Dong, R., Ganesh, P., Sastry, S., Spanos, C..  2017.  Towards a theory of free-lunch privacy in cyber-physical systems. 2017 55th Annual Allerton Conference on Communication, Control, and Computing (Allerton). :902–910.

Emerging cyber-physical systems (CPS) often require collecting end users' data to support data-informed decision-making processes. There has been a long-standing debate about the tradeoff between privacy and data utility. In this paper, we adopt a multiparametric programming approach to rigorously study the conditions under which data utility has to be sacrificed to protect privacy and the situations where free-lunch privacy can be achieved, i.e., data can be concealed without hurting the optimality of the decision making underlying the CPS. We formalize the concept of free-lunch privacy and establish various results on its existence, geometry, and efficient computation. We propose the free-lunch privacy mechanism, a pragmatic mechanism that exploits free-lunch privacy whenever it exists while constantly guaranteeing optimal usage of data. We study the resilience of this mechanism against attacks that attempt to infer the parameters of a user's data-generating process. We close the paper with a case study on occupancy-adaptive smart home temperature control that demonstrates the efficacy of the mechanism.
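
A toy example with hypothetical numbers may clarify the concept: if reporting a masked value leads to the same optimal decision as the true value, the privacy is free. A finite decision set stands in for the paper's multiparametric program:

```python
# Toy illustration of free-lunch privacy; the cost model and numbers are assumptions.
def best_setpoint(occupancy, setpoints=(19, 21, 23)):
    # assumed cost: comfort penalty grows with |setpoint - 21|, weighted by occupancy,
    # plus a small energy term proportional to the setpoint
    cost = lambda s: occupancy * abs(s - 21) + 0.1 * s
    return min(setpoints, key=cost)

true_occupancy = 3
for masked in (2, 3, 4):   # candidate masked reports
    free_lunch = best_setpoint(masked) == best_setpoint(true_occupancy)
    print(masked, best_setpoint(masked), "free-lunch" if free_lunch else "utility loss")
```

In this toy case every masked report yields the same optimal setpoint, so the occupancy value can be concealed at no cost to decision quality.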

2018-05-30
Wen, M., Zhang, X., Li, H., Li, J..  2017.  A Data Aggregation Scheme with Fine-Grained Access Control for the Smart Grid. 2017 IEEE 86th Vehicular Technology Conference (VTC-Fall). :1–5.

With the rapid development of the smart grid, smart meters are deployed at energy consumers' premises to collect real-time usage data. Although such a communication model can help the control center of the energy producer improve the efficiency and reliability of electricity delivery, it also leads to security issues. For example, the real-time data involves customers' privacy: attackers may exploit it to plan break-ins, or they may tamper with the transmitted data for their own benefit. For this reason, many data aggregation schemes have been proposed for privacy preservation. However, few of them address both data aggregation and fine-grained access control to improve data utility. In this paper, we propose a data aggregation scheme based on an attribute decision tree. Security analysis illustrates that our scheme achieves data integrity, data privacy preservation, and fine-grained data access control. Experimental results show that our scheme is more efficient than existing schemes.
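
One common building block behind such aggregation schemes, sketched here for illustration only, is masking with random values that cancel in the sum; the paper's attribute-based access control is not shown:

```python
# Minimal masking-based aggregation sketch; not the paper's construction.
import random

def mask_readings(readings, seed=7):
    """Each meter adds a random mask; the masks are chosen to sum to zero."""
    rng = random.Random(seed)
    masks = [rng.randint(-1000, 1000) for _ in readings[:-1]]
    masks.append(-sum(masks))                # final mask cancels the rest
    return [r + m for r, m in zip(readings, masks)]

readings = [12, 7, 9, 15]                    # kWh from four meters
masked = mask_readings(readings)
print(masked)                                # individually meaningless
print(sum(masked) == sum(readings))          # True: the aggregate is preserved
```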

2018-04-02
Gao, Y., Luo, T., Li, J., Wang, C..  2017.  Research on K Anonymity Algorithm Based on Association Analysis of Data Utility. 2017 IEEE 2nd Advanced Information Technology, Electronic and Automation Control Conference (IAEAC). :426–432.

More and more medical data are being shared, which can lead to the disclosure of personal privacy information. Therefore, constructing a privacy-preserving publishing model for medical data is of great value: not only to break the correspondence between the released information and personal identity, but also to maintain data utility after anonymization. However, there is an inherent contradiction between anonymity and data utility. In this paper, a Principal Component Analysis-Grey Relational Analysis (PCA-GRA) K-anonymity algorithm is proposed to improve data utility effectively under the premise of anonymity, in which the association between quasi-identifiers and the sensitive information is used as a criterion to control the generalization hierarchy. Compared with previous anonymity algorithms, the results show that the proposed PCA-GRA K-anonymity algorithm achieves a significant improvement in data utility in three respects: information loss, feature maintenance, and classification evaluation performance.
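
As a hedged sketch of the guiding idea, the snippet below ranks quasi-identifiers by their association with the sensitive attribute, so that weakly associated attributes can be generalized more aggressively; plain correlation stands in for the paper's PCA-GRA analysis, and all data is hypothetical:

```python
# Association-driven generalization ordering; correlation is a stand-in for PCA-GRA.
import numpy as np

def rank_quasi_identifiers(data, sensitive):
    """data: dict of column -> numeric array; returns columns, weakest association first."""
    scores = {col: abs(np.corrcoef(vals, sensitive)[0, 1])
              for col, vals in data.items()}
    return sorted(scores, key=scores.get)     # generalize these first

qi = {"age":  np.array([25, 31, 47, 52, 38, 61]),
      "zip3": np.array([481, 481, 482, 483, 482, 483])}
risk = np.array([0, 0, 1, 1, 0, 1])           # hypothetical sensitive label
print(rank_quasi_identifiers(qi, risk))       # weakest association generalized most
```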