Biblio

Filters: Keyword is privacy preserving data mining
2018-09-28
Cao, H., Liu, S., Zhao, R., Gu, H., Bao, J., Zhu, L..  2017.  A Privacy Preserving Model for Energy Internet Base on Differential Privacy. 2017 IEEE International Conference on Energy Internet (ICEI). :204–209.

Compared with the traditional grid, the energy internet will collect data more widely and connect more broadly. Analysis of electrical data using Non-intrusive Load Monitoring (NILM) can reveal private user behavior. Balancing data security against data availability is a problem that must be addressed. Because of its rigorous and provable privacy guarantee, differential privacy has been widely applied to privacy-preserving data release and data mining. However, because the sensitivity of this data is high, directly increasing the noise makes the data unusable. In this paper, we propose a differentially private mechanism to protect energy internet privacy. Our focus is on aggregated data that the data owner releases after adding noise to the disaggregated data. Theoretical proofs and experiments show that our scheme achieves both privacy preservation and data availability.
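
As a rough illustration of the general idea behind differentially private release, the sketch below applies the textbook Laplace mechanism to a single aggregated load total. It is not the scheme proposed in the paper, which adds noise to the disaggregated data before aggregation. The function names, the clipping bound `max_load`, and the neighboring-dataset model (one household added or removed) are assumptions made for this example.

```python
import random


def laplace_noise(scale, rng):
    """Draw one Laplace(0, scale) sample.

    The difference of two independent exponential variables with
    mean `scale` follows a Laplace distribution with that scale.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def release_total(loads, max_load, epsilon, rng=None):
    """Release a noisy total of per-household loads for one interval.

    Assumption: each household contributes one reading, clipped to
    [0, max_load], so adding or removing one household changes the
    total by at most max_load (the L1 sensitivity).
    """
    if epsilon <= 0 or max_load <= 0:
        raise ValueError("epsilon and max_load must be positive")
    rng = rng or random.Random()
    clipped = [min(max(x, 0.0), max_load) for x in loads]
    true_total = sum(clipped)
    scale = max_load / epsilon  # Laplace scale = sensitivity / epsilon
    return true_total + laplace_noise(scale, rng)


if __name__ == "__main__":
    readings_kw = [1.2, 0.4, 3.9, 2.5]  # hypothetical readings
    print(release_total(readings_kw, max_load=5.0, epsilon=0.5))
```

A smaller `epsilon` gives stronger privacy but a larger noise scale, which is the tension between privacy and availability that the abstract describes.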

2018-09-12
Doan, Khue, Quang, Minh Nguyen, Le, Bac.  2017.  Applied Cuckoo Algorithm for Association Rule Hiding Problem. Proceedings of the Eighth International Symposium on Information and Communication Technology. :26–33.
Database security is attracting growing interest in the data mining field: how can legitimate data be exploited without disclosing sensitive information? Among the many approaches proposed, a prominent one is privacy preservation in association rule mining, which hides sensitive rules. In recent years, meta-heuristic algorithms have become effective for this goal; one of them applies the cuckoo optimization algorithm (COA4ARH). This paper introduces an improved version of COA4ARH that minimizes the side effect of missing non-sensitive rules. The main contribution of this study is a new pre-processing stage that determines the minimum number of transactions needed to initialize the initial habitat, thereby restricting the modification operations applied to the original data. To evaluate the effectiveness of the proposed method, we conducted several experiments on real datasets. The experimental results show that the improved approach performs better than the original algorithm.
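
To make the notion of rule hiding concrete, the sketch below computes the confidence of an association rule and hides the rule with a simple greedy distortion heuristic: it removes one consequent item from supporting transactions until the confidence drops below a threshold. This is a generic baseline, not COA4ARH or the improved method described above, and all function names and the toy data are hypothetical.

```python
def count(transactions, itemset):
    """Number of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(1 for t in transactions if itemset <= t)


def confidence(transactions, antecedent, consequent):
    """Confidence of the rule antecedent -> consequent."""
    n_ante = count(transactions, antecedent)
    if n_ante == 0:
        return 0.0
    both = set(antecedent) | set(consequent)
    return count(transactions, both) / n_ante


def hide_rule(transactions, antecedent, consequent, min_conf):
    """Greedily lower a rule's confidence below min_conf.

    Works on a copy of the data. Returns the sanitized transactions
    and the number of transactions that were modified.
    """
    sanitized = [set(t) for t in transactions]
    rule_items = set(antecedent) | set(consequent)
    victim = sorted(consequent)[0]  # item to delete (arbitrary choice)
    modified = 0
    for t in sanitized:
        if confidence(sanitized, antecedent, consequent) < min_conf:
            break
        if rule_items <= t:
            t.discard(victim)
            modified += 1
    return sanitized, modified


if __name__ == "__main__":
    data = [{"a", "b"}, {"a", "b"}, {"a", "b"}, {"a"}, {"b", "c"}]
    clean, changed = hide_rule(data, {"a"}, {"b"}, min_conf=0.5)
    print(changed, confidence(clean, {"a"}, {"b"}))
```

Each deleted item can also lower the support of non-sensitive rules that share it, which is the side effect the paper aims to minimize by limiting how many transactions are modified.
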
Nagaratna, M., Sowmya, Y..  2017.  M-sanit: Computing misusability score and effective sanitization of big data using Amazon elastic MapReduce. 2017 International Conference on Computation of Power, Energy Information and Commuincation (ICCPEIC). :029–035.
The advent of distributed programming frameworks like Hadoop paved the way for processing voluminous data, known as big data. Due to the exponential growth of data, enterprises started to exploit the availability of cloud infrastructure for storing and processing big data. Insider attacks on outsourced data cause leakage of sensitive data. Therefore, it is essential to sanitize data so as to preserve privacy, that is, to prevent disclosure of sensitive data. Privacy Preserving Data Publishing (PPDP) and Privacy Preserving Data Mining (PPDM) are the areas in which data sanitization plays a vital role in preserving privacy. The existing anonymization techniques for MapReduce programming can be improved by adding a misusability measure that determines the level of sanitization to apply to big data. To overcome this limitation, we propose a framework known as M-Sanit, which has mechanisms to exploit the misusability score of big data prior to performing sanitization using the MapReduce programming paradigm. Our empirical study using a real-world cloud ecosystem, namely Amazon Elastic Compute Cloud (EC2) and Amazon Elastic MapReduce (EMR), reveals the effectiveness of misusability-score-based sanitization of big data prior to publishing or mining it.
2015-05-05
Jandel, M., Svenson, P., Johansson, R..  2014.  Fusing restricted information. Information Fusion (FUSION), 2014 17th International Conference on. :1-9.

Information fusion deals with the integration and merging of data and information from multiple (heterogeneous) sources. In many cases, the information that needs to be fused has a security classification. The result of the fusion process is then by necessity restricted at the strictest information security classification of the inputs. This has severe drawbacks and limits the possible dissemination of the fusion results. It leads to decreased situational awareness: the organization knows information that would enable a better situation picture, but since parts of the information are restricted, it is not possible to distribute the most correct situational information. In this paper, we take steps towards defining fusion and data mining processes that can be used even when the underlying data cannot all be disseminated. The method we propose here could be used to produce a classifier where all the sensitive information has been removed and where it can be shown that an antagonist cannot, even in principle, obtain knowledge about the classified information by using the classifier or situation picture.
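
The baseline rule that the abstract criticizes, where a fused result inherits the strictest classification among its inputs, can be sketched as follows. The level names and their ordering are hypothetical placeholders, not taken from the paper, and the sketch shows only this default rule, not the authors' proposed method.

```python
from enum import IntEnum


class Level(IntEnum):
    """Hypothetical classification levels, ordered from least to most strict."""
    UNCLASSIFIED = 0
    RESTRICTED = 1
    SECRET = 2


def fused_level(input_levels):
    """Return the classification of a fusion result under the default rule:
    the result is as restricted as its most restricted input."""
    levels = list(input_levels)
    if not levels:
        raise ValueError("at least one input is required")
    return max(levels)


if __name__ == "__main__":
    inputs = [Level.UNCLASSIFIED, Level.SECRET, Level.RESTRICTED]
    print(fused_level(inputs).name)
```

Under this rule, a single highly classified input restricts the whole result, which is why the fused situation picture often cannot be distributed.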
 

Lei Xu, Chunxiao Jiang, Jian Wang, Jian Yuan, Yong Ren.  2014.  Information Security in Big Data: Privacy and Data Mining. Access, IEEE. 2:1149-1176.

The growing popularity and development of data mining technologies bring a serious threat to the security of individuals' sensitive information. An emerging research topic in data mining, known as privacy-preserving data mining (PPDM), has been extensively studied in recent years. The basic idea of PPDM is to modify the data in such a way that data mining algorithms can be performed effectively without compromising the security of sensitive information contained in the data. Current studies of PPDM mainly focus on how to reduce the privacy risk brought by data mining operations, while in fact, unwanted disclosure of sensitive information may also happen in the process of data collecting, data publishing, and information (i.e., the data mining results) delivering. In this paper, we view the privacy issues related to data mining from a wider perspective and investigate various approaches that can help to protect sensitive information. In particular, we identify four different types of users involved in data mining applications, namely, data provider, data collector, data miner, and decision maker. For each type of user, we discuss his privacy concerns and the methods that can be adopted to protect sensitive information. We briefly introduce the basics of related research topics, review state-of-the-art approaches, and present some preliminary thoughts on future research directions. Besides exploring the privacy-preserving approaches for each type of user, we also review the game theoretical approaches, which are proposed for analyzing the interactions among different users in a data mining scenario, each of whom has his own valuation of the sensitive information. By differentiating the responsibilities of different users with respect to the security of sensitive information, we would like to provide some useful insights into the study of PPDM.