Biblio

Filters: Keyword is anonymization
2023-03-31
Shrivastva, Krishna Mohan Pd, Rizvi, M.A., Singh, Shailendra.  2014.  Big Data Privacy Based on Differential Privacy a Hope for Big Data. 2014 International Conference on Computational Intelligence and Communication Networks. :776–781.
In the information age, data from electronic, information, and communication technologies such as sensors, cloud services, individual archives, social networks, internet activity, and enterprises is growing exponentially. The most challenging issue is how to effectively manage such large and heterogeneous data; big data is the term coined for it. Owing to its extraordinary scale, privacy and security are among big data's most critical challenges, and at every stage of managing big data there is a chance that privacy may be disclosed. Many techniques have been suggested and implemented for privacy preservation of large datasets, including anonymization-based and encryption-based approaches, but the defining characteristics of big data (large volume, high velocity, and unstructured content) mean none of these techniques is fully suitable. In this paper we analyze and discuss how an existing approach, differential privacy, can serve big data: we first introduce differential privacy and then examine its suitability for the big data setting.
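The Laplace mechanism is the canonical building block of the differential privacy approach the paper analyzes. A minimal Python sketch of it, with an illustrative dataset, predicate, and epsilon (not the authors' implementation):

```python
import numpy as np

def laplace_count(data, predicate, epsilon):
    """Release a differentially private count of records satisfying `predicate`.

    A counting query has sensitivity 1 (adding or removing one record changes
    the count by at most 1), so Laplace noise with scale 1/epsilon yields
    epsilon-differential privacy.
    """
    true_count = sum(1 for record in data if predicate(record))
    noise = np.random.laplace(loc=0.0, scale=1.0 / epsilon)
    return true_count + noise

# Illustrative use: noisy count of records with age over 40, epsilon = 0.5.
ages = [23, 45, 31, 52, 67, 38, 41, 29]
print(laplace_count(ages, lambda a: a > 40, epsilon=0.5))
```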
Canbay, Yavuz, Vural, Yilmaz, Sagiroglu, Seref.  2018.  Privacy Preserving Big Data Publishing. 2018 International Congress on Big Data, Deep Learning and Fighting Cyber Terrorism (IBIGDELFT). :24–29.
To gain more benefit from big data, it must be shared, published, analyzed, and processed without causing harm or violating anyone's rights. The literature reports that such analytics raises privacy violations, a matter also regulated by law, with fines for offending companies, institutions, or individuals. As a result, data collectors avoid publishing or sharing their big data. A number of techniques have therefore been developed to reduce privacy risks and enable big data publishing while preserving privacy; these are known as privacy-preserving big data publishing (PPBDP) models. This study presents the privacy problem in big data, evaluates big data components from a privacy perspective along with the privacy risks and protection methods involved in big data publishing, and reviews existing PPBDP approaches and anonymization methods in the literature. The results are evaluated and discussed, and new suggestions are presented.
2022-06-08
Giehl, Alexander, Heinl, Michael P., Busch, Maximilian.  2021.  Leveraging Edge Computing and Differential Privacy to Securely Enable Industrial Cloud Collaboration Along the Value Chain. 2021 IEEE 17th International Conference on Automation Science and Engineering (CASE). :2023–2028.
Big data continues to grow in the manufacturing domain due to increasing interconnectivity on the shop floor in the course of the fourth industrial revolution. The optimization of machines based on either real-time or historical machine data provides benefits to both machine producers and operators. In order to be able to make use of these opportunities, it is necessary to access the machine data, which can include sensitive information such as intellectual property. Employing the use case of machine tools, this paper presents a solution enabling industrial data sharing and cloud collaboration while protecting sensitive information. It employs the edge computing paradigm to apply differential privacy to machine data in order to protect sensitive information and simultaneously allow machine producers to perform the necessary calculations and analyses using this data.
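As a rough illustration of the edge-side pattern described here, the sketch below clips a machine reading to an assumed range and adds Laplace noise on the edge device before anything reaches the cloud; the range, sensitivity, and privacy budget are made-up parameters, not the authors' pipeline.

```python
import numpy as np

LOWER, UPPER = 0.0, 100.0   # assumed physical range of the sensor value
EPSILON = 1.0               # illustrative privacy budget per reading

def privatize_reading(value, epsilon=EPSILON):
    """Clip a raw machine reading to a known range, then add Laplace noise
    on the edge device so only the noisy value leaves the shop floor."""
    clipped = min(max(value, LOWER), UPPER)
    sensitivity = UPPER - LOWER  # worst-case change from one bounded reading
    return clipped + np.random.laplace(0.0, sensitivity / epsilon)

# The edge gateway would forward only the perturbed value upstream.
print(privatize_reading(73.4))
```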
2022-02-09
Weng, Jui-Hung, Chi, Po-Wen.  2021.  Multi-Level Privacy Preserving K-Anonymity. 2021 16th Asia Joint Conference on Information Security (AsiaJCIS). :61–67.
k-anonymity is a well-known definition of privacy, which guarantees that any person in the released dataset cannot be distinguished from at least k-1 other individuals. In the protection model, the records are anonymized through generalization or suppression with a fixed value of k. Accordingly, each record has the same level of anonymity in the published dataset. However, different people or items usually have inconsistent privacy requirements. Some records need extra protection while others require a relatively low level of privacy constraint. In this paper, we propose Multi-Level Privacy Preserving K-Anonymity, an advanced protection model based on k-anonymity, which divides records into different groups and requires each group to satisfy its respective privacy requirement. Moreover, we present a practical algorithm using clustering techniques to ensure the property. The evaluation on a real-world dataset confirms that the proposed method has the advantages of offering more flexibility in setting privacy parameters and providing higher data utility than traditional k-anonymity.
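A sketch of the core property, under assumptions about the data layout (the paper's clustering algorithm is not reproduced): each record carries its own required k, and every equivalence class must be at least as large as the strictest requirement it contains.

```python
from collections import defaultdict

def satisfies_multilevel_k(records):
    """Each record is (quasi_identifier_tuple, required_k). The dataset
    satisfies multi-level k-anonymity if every equivalence class (records
    sharing a quasi-identifier tuple) is at least as large as the largest
    required_k inside it."""
    groups = defaultdict(list)
    for qi, required_k in records:
        groups[qi].append(required_k)
    return all(len(ks) >= max(ks) for ks in groups.values())

# Generalized records: (age range, ZIP prefix) plus each record's own k.
records = [
    (("30-39", "123**"), 2),
    (("30-39", "123**"), 3),
    (("30-39", "123**"), 2),
    (("40-49", "456**"), 2),
    (("40-49", "456**"), 2),
]
print(satisfies_multilevel_k(records))  # True: group sizes 3 and 2 suffice
```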
2021-06-02
Anbumani, P., Dhanapal, R.  2020.  Review on Privacy Preservation Methods in Data Mining Based on Fuzzy Based Techniques. 2020 2nd International Conference on Advances in Computing, Communication Control and Networking (ICACCCN). :689–694.
The main motivation behind data mining algorithms is to extract previously unknown patterns from extremely large datasets, and recent advances in information technology have driven remarkable growth in data collection. Privacy issues have received little attention in the process mining community, although several privacy-preserving data transformation techniques have been proposed in the data mining community. While data mining and process mining have much in common, key differences make privacy-preserving data mining techniques unsuitable for anonymizing process data. Results produced by data mining algorithms are used in areas such as marketing, weather forecasting, and image analysis, and the output of a mining algorithm can itself reveal sensitive information. Privacy can be safeguarded using privacy preservation techniques (PPT), an important concept in data mining because data exchanged between parties needs protection, so that other parties cannot learn the actual data transferred. Privacy preservation in data mining means preventing disclosure of sensitive information while keeping the mining output valuable. Two broad approaches are used: altering the input data or altering the output of the mining process. The method proposed here for privacy preservation in database environments is data transformation using a triangular fuzzy membership function applied to the original dataset.
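The transformation described uses a triangular fuzzy membership function; a generic sketch of that idea, replacing each value with its membership degree, with made-up triangle parameters rather than the authors' settings:

```python
def triangular_membership(x, a, b, c):
    """Triangular fuzzy membership: 0 outside [a, c], rising linearly on
    [a, b], peaking at 1 when x == b, falling linearly on [b, c]."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)

def fuzzify(values, a, b, c):
    """Replace each original value with its membership degree, hiding the
    raw values while preserving their rough position in the range."""
    return [round(triangular_membership(v, a, b, c), 3) for v in values]

salaries = [28000, 41000, 55000, 62000, 90000]
print(fuzzify(salaries, a=20000, b=55000, c=100000))
```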
2021-05-25
ÇELİK, Mahmut, ALKAN, Mustafa, ALKAN, Abdulkerim Oğuzhan.  2020.  Protection of Personal Data Transmitted via Web Service Against Software Developers. 2020 International Conference on Information Security and Cryptology (ISCTURKEY). :88–92.
With the widespread use of information technologies, institutions have started to offer most of their services electronically; e-government is the best example. Moving services to the electronic environment improves their quality and makes them easier to access, and because personal information can be verified through inter-agency information sharing systems, erroneous or fraudulent transactions can be prevented. Since information sharing between institutions is generally done through web services, protecting the personal data transmitted via those services is of great importance. Comprehensive national and international regulations exist on the protection of personal data; under these regulations, protecting personal data shared between institutions is a legal obligation and an issue that must be handled comprehensively. This study discusses the protection of personal data shared between institutions through web services against the software developers who build and maintain those services. A proposed application aims to provide a new security measure for this purpose; it consists of a web interface built with React and Java together with REST services that anonymize the personal data.
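The paper's portal uses React, Java, and REST services; as a language-neutral sketch of that kind of field-level anonymization, the snippet below pseudonymizes an identifier with a keyed hash, masks a name, and generalizes a birth date. The field names and hashing scheme are assumptions, not the authors' design.

```python
import hashlib
import hmac

SECRET_KEY = b"rotate-me"  # held by the service, never given to developers

def pseudonymize(value, key=SECRET_KEY):
    """Keyed hash so equal identifiers stay linkable across records
    without exposing the raw value."""
    return hmac.new(key, value.encode(), hashlib.sha256).hexdigest()[:16]

def anonymize_record(record):
    """Return a copy of a citizen record safe to expose to developers."""
    return {
        "national_id": pseudonymize(record["national_id"]),
        "name": record["name"][0] + "***",          # mask direct identifier
        "birth_year": record["birth_date"][:4],     # generalize full date
        "city": record["city"],                     # non-identifying field kept
    }

citizen = {"national_id": "12345678901", "name": "Ayse",
           "birth_date": "1990-07-14", "city": "Ankara"}
print(anonymize_record(citizen))
```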
2021-05-13
Guan, Bo, Takbiri, Nazanin, Goeckel, Dennis L., Houmansadr, Amir, Pishro-Nik, Hossein.  2020.  Sequence Obfuscation to Thwart Pattern Matching Attacks. 2020 IEEE International Symposium on Information Theory (ISIT). :884–889.

Suppose we are given a large number of sequences on a given alphabet, and an adversary is interested in identifying (de-anonymizing) a specific target sequence based on its patterns. Our goal is to thwart such an adversary by obfuscating the target sequence, applying small artificial distortions to its values. A key point here is that we would like to make no assumptions about the statistical model of such sequences. This is in contrast to the existing literature, where assumptions (e.g., Markov chains) are made about the sequences to obtain privacy guarantees. We relate this problem to a set of combinatorial questions on sequence construction, based on which we are able to obtain provable guarantees. This problem is relevant to important privacy applications: from fingerprinting webpages visited by users through anonymous communication systems, to linking communicating parties on messaging applications, to inferring the activities of IoT device users.
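A toy version of the idea (not the paper's construction): perturb a small fraction of the target sequence's symbols so that pattern matching against the original fails, with no statistical model assumed. The distortion rate is illustrative.

```python
import random

def obfuscate(sequence, alphabet, rate=0.1, rng=random.Random(7)):
    """Replace roughly `rate` of the symbols with different symbols drawn
    uniformly from the alphabet, introducing small artificial distortions."""
    out = list(sequence)
    for i in range(len(out)):
        if rng.random() < rate:
            out[i] = rng.choice([s for s in alphabet if s != out[i]])
    return "".join(out)

target = "ABABABACABABABAC"  # a sequence with a telling repeated pattern
print(obfuscate(target, alphabet="ABC"))
```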

2021-04-29
Fejrskov, M., Pedersen, J. M., Vasilomanolakis, E.  2020.  Cyber-security research by ISPs: A NetFlow and DNS Anonymization Policy. :1–8.

Internet Service Providers (ISPs) have an economic and operational interest in detecting malicious network activity relating to their subscribers. However, it is unclear what kind of traffic data an ISP has available for cyber-security research, and under which legal conditions it can be used. This paper gives an overview of the challenges posed by legislation and of the data sources available to a European ISP. DNS and NetFlow logs are identified as relevant data sources and the state of the art in anonymization and fingerprinting techniques is discussed. Based on legislation, data availability and privacy considerations, a practically applicable anonymization policy is presented.
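Two primitives commonly found in such policies, sketched under the assumption of IPv4 /24 truncation and keyed hashing of DNS query names; the paper's concrete policy choices are not reproduced here.

```python
import hashlib
import hmac
import ipaddress

HASH_KEY = b"per-dataset-secret"

def truncate_ip(ip, prefix=24):
    """Keep only the /prefix network part of an IPv4 address from a
    NetFlow record, dropping the host bits."""
    network = ipaddress.ip_network(f"{ip}/{prefix}", strict=False)
    return str(network.network_address)

def pseudonymize_qname(qname, key=HASH_KEY):
    """Replace a DNS query name with a keyed hash: linkable within the
    dataset, but not reversible without the key."""
    return hmac.new(key, qname.lower().encode(), hashlib.sha256).hexdigest()[:12]

print(truncate_ip("192.0.2.137"))          # -> 192.0.2.0
print(pseudonymize_qname("example.com."))
```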

2020-08-13
Nosouhi, Mohammad Reza, Yu, Shui, Sood, Keshav, Grobler, Marthie.  2019.  HSDC–Net: Secure Anonymous Messaging in Online Social Networks. 2019 18th IEEE International Conference On Trust, Security And Privacy In Computing And Communications/13th IEEE International Conference On Big Data Science And Engineering (TrustCom/BigDataSE). :350–357.
Hiding the contents of users' messages has been successfully addressed before, while anonymization of message senders remains a challenge since users do not usually trust ISPs and messaging application providers. Several solutions have been proposed to resolve this challenge; among them, the Dining Cryptographers network protocol (DC-net) provides the strongest anonymity guarantees. However, DC-net suffers from two critical issues that make it impractical: (1) the possibility of collisions and (2) vulnerability to disruption. Apart from these, we noticed a third critical issue during our investigation: (3) DC-net users can be de-anonymized after they publish at least three messages. We name this the short stability issue and prove that anonymity is provided only for a few cycles of message publishing. As far as we know, this problem has not been identified in previous research. In this paper, we propose Harmonized and Stable DC-net (HSDC-net), a self-organizing protocol for anonymous communications. In our protocol design, we first resolve the short stability issue and obtain SDC-net, a stable extension of DC-net. We then integrate the Slot Reservation and Disruption Management sub-protocols into SDC-net to overcome the collision and security issues, respectively. The resulting HSDC-net protocol can also be integrated into blockchain-based cryptocurrencies (e.g., Bitcoin) to mix multiple transactions (belonging to different users) into a single transaction such that the source of each payment is unknown, preserving the privacy of blockchain users. Our prototype implementation shows that HSDC-net achieves low latencies, making it a practical protocol.
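A three-party toy round of the underlying DC-net primitive that HSDC-net extends (not HSDC-net itself): each pair of participants shares a random key, everyone announces the XOR of their keys, and the sender additionally XORs in the message, so combining all announcements cancels the keys and reveals the message without revealing who sent it.

```python
import secrets

MSG_LEN = 8  # bytes per round

def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

# Pairwise shared keys for participants 0, 1, 2.
k01, k02, k12 = (secrets.token_bytes(MSG_LEN) for _ in range(3))

message = b"hi DCnet"          # participant 0 is the anonymous sender
announce0 = xor(xor(k01, k02), message)
announce1 = xor(k01, k12)      # non-senders announce only their key XOR
announce2 = xor(k02, k12)

# Every key appears exactly twice across the announcements and cancels out.
recovered = xor(xor(announce0, announce1), announce2)
print(recovered)               # b'hi DCnet', yet the sender stays hidden
```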
Augusto, Cristian, Morán, Jesús, De La Riva, Claudio, Tuya, Javier.  2019.  Test-Driven Anonymization for Artificial Intelligence. 2019 IEEE International Conference On Artificial Intelligence Testing (AITest). :103–110.
In recent years, the amount of data published and shared with third parties to develop artificial intelligence (AI) tools and services has increased significantly. When there are regulatory or internal requirements regarding the privacy of data, anonymization techniques are used to maintain privacy by transforming the data. The side effect is that anonymization may render the data useless for training and testing the AI, whose quality depends heavily on the data. To overcome this problem, we propose a test-driven anonymization approach for artificial intelligence tools. The approach tests different anonymization efforts to achieve a trade-off between privacy (non-functional quality) and the functional suitability of the artificial intelligence technique (functional quality). The approach has been validated by means of two real-life datasets in the domains of healthcare and health insurance. Each dataset is anonymized with several privacy protections and then used to train classification AIs. The results show how data can be anonymized to achieve adequate functional suitability in the AI context while keeping the privacy of the anonymized data as high as possible.
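In spirit, the approach sweeps anonymization strengths and measures the resulting classifier quality. A minimal sketch with synthetic data and Laplace perturbation as a stand-in for the paper's privacy protections:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(600, 4))
y = (X[:, 0] + X[:, 1] > 0).astype(int)          # synthetic labels

for noise_scale in [0.0, 0.5, 1.0, 2.0, 4.0]:    # increasing privacy effort
    X_anon = X + rng.laplace(0.0, noise_scale, size=X.shape)
    X_tr, X_te, y_tr, y_te = train_test_split(X_anon, y, random_state=0)
    model = LogisticRegression().fit(X_tr, y_tr)
    acc = accuracy_score(y_te, model.predict(X_te))
    print(f"noise={noise_scale:.1f} -> accuracy={acc:.2f}")
```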
2019-01-31
Simmons, Andrew J., Curumsing, Maheswaree Kissoon, Vasa, Rajesh.  2018.  An Interaction Model for De-Identification of Human Data Held by External Custodians. Proceedings of the 30th Australian Conference on Computer-Human Interaction. :23–26.

Reuse of pre-existing industry datasets for research purposes requires a multi-stakeholder solution that balances the researcher's analysis objectives with the need to engage the industry data custodian, whilst respecting the privacy rights of human data subjects. Current methods place the burden on the data custodian, who may not be sufficiently trained to fully appreciate the nuances of data de-identification. Through modelling of functional, quality, and emotional goals, we propose a de-identification-in-the-cloud approach whereby the researcher proposes analyses along with the extraction and de-identification operations, while the industry data custodian retains secure control over authorising the proposed analyses. We demonstrate our approach through the implementation of a de-identification portal for sports club data.

Nakamura, T., Nishi, H.  2018.  TMk-Anonymity: Perturbation-Based Data Anonymization Method for Improving Effectiveness of Secondary Use. IECON 2018 - 44th Annual Conference of the IEEE Industrial Electronics Society. :3138–3143.

The recent emergence of smartphones, cloud computing, and the Internet of Things has brought about an explosion of data creation. By collating and merging these enormous data with other information, services that use information become more sophisticated and advanced. At the same time, however, consideration of the privacy violations caused by such merging is indispensable, and various anonymization methods have been proposed to preserve privacy. Conventional perturbation-based anonymization of location data adds comparatively large noise, which makes it difficult to utilize the data effectively for secondary use. In this research, to solve these problems, we first clarify the definition of privacy preservation and then propose TMk-anonymity according to that definition.
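TMk-anonymity itself is defined in the paper; as a generic illustration of perturbation-based location anonymization with deliberately small, bounded noise, the sketch below jitters a GPS point within an assumed radius.

```python
import math
import random

def perturb_location(lat, lon, max_meters=100.0, rng=random.Random(1)):
    """Move a GPS point a uniformly random distance (up to max_meters) in a
    random direction: small, bounded noise meant to preserve utility."""
    distance = rng.uniform(0, max_meters)
    bearing = rng.uniform(0, 2 * math.pi)
    dlat = (distance * math.cos(bearing)) / 111_320   # meters per degree latitude
    dlon = (distance * math.sin(bearing)) / (111_320 * math.cos(math.radians(lat)))
    return lat + dlat, lon + dlon

print(perturb_location(35.6812, 139.7671))  # a point near Tokyo Station
```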

2018-09-28
Arai, Hiromi, Emura, Keita, Hayashi, Takuya.  2017.  A Framework of Privacy Preserving Anomaly Detection: Providing Traceability Without Big Brother. Proceedings of the 2017 on Workshop on Privacy in the Electronic Society. :111–122.

Collecting and analyzing personal data is important in modern information applications. Though the privacy of data providers should be protected, some adversarial users may behave badly under circumstances in which they are not identified. At the same time, the privacy of honest users should not be infringed. Thus, detecting anomalies without revealing normal users' identities is quite important for operating information systems that use personal data. Though various statistical and machine learning methods have been developed for detecting anomalies, it is difficult to know in advance what anomaly will come up. Thus, it would be useful to provide a "general" framework that can employ any anomaly detection method regardless of the type of data and the nature of the abnormality. In this paper, we propose a privacy-preserving anomaly detection framework that allows an authority to detect adversarial users while other honest users are kept anonymous. By using two cryptographic techniques, group signatures with message-dependent opening (GS-MDO) and public key encryption with non-interactive opening (PKENO), we provide a correspondence table that links users and data in a secure way, and we can employ any anonymization technique and any anomaly detection method. It is particularly worth noting that no big brother exists, meaning that no single entity can identify users, while bad behaviors are always traceable. We also report on an implementation of our framework; briefly, its overhead is on the order of dozens of milliseconds.

2018-09-05
Takbiri, N., Houmansadr, A., Goeckel, D. L., Pishro-Nik, H.  2017.  Limits of location privacy under anonymization and obfuscation. 2017 IEEE International Symposium on Information Theory (ISIT). :764–768.

The prevalence of mobile devices and location-based services (LBS) has generated great concern about LBS users' privacy, which can be compromised by statistical analysis of their movement patterns. A number of algorithms have been proposed to protect user privacy in such systems, but the fundamental underpinnings of these protections remain unexplored. Recently, the concept of perfect location privacy was introduced and its achievability was studied for anonymization-based LBS systems, where user identifiers are permuted at regular intervals to prevent identification based on statistical analysis of long time sequences. In this paper, we significantly extend that investigation by incorporating the other major tool commonly employed to obtain location privacy: obfuscation, where user locations are purposely obscured to protect privacy. Since anonymization and obfuscation reduce user utility in LBS systems, we investigate how location privacy varies with the degree to which each of these two methods is employed. We provide (1) achievability results for the case where the location of each user is governed by an i.i.d. process, and (2) converse results for the i.i.d. case as well as the more general Markov chain model. We show that, as the number of users in the network grows, the obfuscation-anonymization plane can be divided into two regions: in the first region, all users have perfect location privacy; in the second, no user has location privacy.

2018-05-24
Chen, Lin, Xu, Lei, Shah, Nolan, Diallo, Nour, Gao, Zhimin, Lu, Yang, Shi, Weidong.  2017.  Unraveling Blockchain Based Crypto-Currency System Supporting Oblivious Transactions: A Formalized Approach. Proceedings of the ACM Workshop on Blockchain, Cryptocurrencies and Contracts. :23–28.

User privacy is an important issue in blockchain-based transaction systems. Bitcoin, one of the most widely used blockchain-based transaction systems, fails to provide enough protection of users' privacy. Many subsequent studies, Zerocoin and Zerocash among them, focus on establishing a system that hides the linkage between the identities (pseudonyms) of users and the transactions they carry out, in order to provide a high level of anonymity. It thus becomes an interesting question whether such new transaction systems do provide enough privacy protection. In this paper, we propose a novel and effective approach for de-anonymizing these transaction systems by leveraging information in the system that is not directly identity-related, including the number of transactions made by each identity and the timestamps of sending and receiving. Combining probabilistic analysis with optimization tools, we establish a model that allows us to determine, among all possible ways of linking transactions and identities, the one that is most likely to be true. Subsequent transaction graph analysis can then be carried out, leading to the de-anonymization of the system. To solve the model, we provide exact algorithms based on mixed integer linear programming. Our research also establishes interesting relationships between the de-anonymization problem and other problems studied in theoretical computer science, e.g., the graph matching problem and the scheduling problem.
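The paper's most-likely-linking formulation is solved with mixed integer linear programming; in the simplest special case it reduces to an assignment problem, sketched below with SciPy's Hungarian-algorithm solver on a made-up log-likelihood matrix (transaction clusters versus candidate identities).

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

# log_likelihood[i, j]: how well transaction cluster i matches identity j,
# derived (in the paper's setting) from transaction counts and timestamps.
log_likelihood = np.array([
    [-1.2, -4.0, -3.1],
    [-3.5, -0.8, -2.9],
    [-2.7, -3.3, -1.1],
])

# Maximizing total log-likelihood equals minimizing its negation.
rows, cols = linear_sum_assignment(-log_likelihood)
for cluster, identity in zip(rows, cols):
    print(f"cluster {cluster} -> identity {identity}")
```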

2018-04-02
Al-Zobbi, M., Shahrestani, S., Ruan, C.  2017.  Implementing A Framework for Big Data Anonymity and Analytics Access Control. 2017 IEEE Trustcom/BigDataSE/ICESS. :873–880.

Analytics in big data is maturing and moving towards mass adoption, which increases the need for innovative tools and methodologies to protect data against privacy violations. Many data anonymization methods have been proposed to provide some degree of privacy protection by applying data suppression and other distortion techniques. However, currently available methods suffer from poor scalability and performance and from a lack of framework standardization, and they are unable to cope with the massive scale of big data processing. Some of these methods were specifically proposed for the MapReduce framework to operate on big data, yet they still rely on conventional data management approaches and therefore yield no remarkable performance gains. We introduce a framework that operates in the MapReduce environment to benefit from its advantages, as well as from those of the Hadoop ecosystem. Our framework provides granular user access that can be tuned to different authorization levels, applying fine-grained alteration based on the user's authorization level for accessing the MapReduce domain for analytics. Using well-developed role-based access control approaches, the framework assigns roles to users and maps them to the relevant data attributes.
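A toy cut at the access-control side of such a framework: roles map to authorization levels, and attributes above a user's level are suppressed before records reach the analytics job. The roles, levels, and fields are invented for illustration.

```python
ROLE_LEVEL = {"analyst": 1, "data_steward": 2, "admin": 3}    # assumed roles
ATTRIBUTE_LEVEL = {"zip": 1, "birth_date": 2, "national_id": 3}

def filter_record(record, role):
    """Suppress every attribute whose required level exceeds the role's
    authorization level; the analytics job sees only the remainder."""
    level = ROLE_LEVEL[role]
    return {k: v for k, v in record.items()
            if ATTRIBUTE_LEVEL.get(k, 0) <= level}

row = {"zip": "77005", "birth_date": "1984-02-11", "national_id": "A12345"}
print(filter_record(row, "analyst"))       # only 'zip' survives
print(filter_record(row, "data_steward"))  # 'zip' and 'birth_date'
```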

2017-04-20
Rohrmann, R., Patton, M. W., Chen, H.  2016.  Anonymous port scanning: Performing network reconnaissance through Tor. 2016 IEEE Conference on Intelligence and Security Informatics (ISI). :217–217.

The anonymizing network Tor is examined as one method of anonymizing port scanning tools and avoiding identification and retaliation. Performing anonymized port scans through Tor is possible using Nmap, but parallelization of the scanning processes is required to accelerate the scan rate.

2015-05-05
Shukla, S., Sadashivappa, G.  2014.  Secure multi-party computation protocol using asymmetric encryption. Computing for Sustainable Global Development (INDIACom), 2014 International Conference on. :780–785.

Privacy preservation is essential in various real-life applications such as medical science and financial analysis. This paper focuses on the implementation of an asymmetric secure multi-party computation protocol using anonymization and public-key encryption, in which all parties have access to a trusted third party (TTP) that (1) contributes nothing to the computation, (2) does not know who owns the inputs it receives, (3) has a large number of resources, and (4) holds the decryption key needed to recover the actual inputs for computing the final result. The concern in this setting is to design a protocol that deploys the TTP for the computation. The protocol is argued to be more proficient (in terms of secure computation and individual privacy) for the parties than other available protocols. The solution incorporates an asymmetric encryption scheme in which any party can encrypt a message with the public key, but decryption can be done only by the possessor of the decryption key (the private key). As the protocol relies on asymmetric encryption and packetization, it ensures (1) confidentiality (anonymity), (2) security, and (3) data privacy.
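A minimal sketch of the protocol shape described (parties encrypt inputs under the TTP's public key; the TTP decrypts, aggregates, and publishes only the result), using RSA-OAEP from the cryptography package; the anonymizing channel and packetization steps are assumed rather than implemented.

```python
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import padding, rsa

OAEP = padding.OAEP(mgf=padding.MGF1(algorithm=hashes.SHA256()),
                    algorithm=hashes.SHA256(), label=None)

# TTP key pair: the parties know only the public key.
ttp_private = rsa.generate_private_key(public_exponent=65537, key_size=2048)
ttp_public = ttp_private.public_key()

# Each party encrypts its private input; ciphertexts arrive unattributed
# (the anonymizing channel is assumed, not implemented here).
inputs = [42, 17, 99]
ciphertexts = [ttp_public.encrypt(str(x).encode(), OAEP) for x in inputs]

# The TTP decrypts the unlinkable ciphertexts and publishes only the sum.
total = sum(int(ttp_private.decrypt(c, OAEP).decode()) for c in ciphertexts)
print("sum of private inputs:", total)
```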

Lei Xu, Chunxiao Jiang, Jian Wang, Jian Yuan, Yong Ren.  2014.  Information Security in Big Data: Privacy and Data Mining. IEEE Access. 2:1149–1176.

The growing popularity and development of data mining technologies bring serious threats to the security of individuals' sensitive information. An emerging research topic in data mining, known as privacy-preserving data mining (PPDM), has been extensively studied in recent years. The basic idea of PPDM is to modify the data in such a way that data mining algorithms can be performed effectively without compromising the security of the sensitive information contained in the data. Current studies of PPDM mainly focus on how to reduce the privacy risk brought by data mining operations, while in fact unwanted disclosure of sensitive information may also happen during data collection, data publishing, and information delivery (i.e., delivery of the data mining results). In this paper, we view the privacy issues related to data mining from a wider perspective and investigate various approaches that can help protect sensitive information. In particular, we identify four different types of users involved in data mining applications, namely the data provider, data collector, data miner, and decision maker. For each type of user, we discuss the privacy concerns and the methods that can be adopted to protect sensitive information. We briefly introduce the basics of related research topics, review state-of-the-art approaches, and present some preliminary thoughts on future research directions. Besides exploring privacy-preserving approaches for each type of user, we also review the game-theoretic approaches proposed for analyzing the interactions among different users in a data mining scenario, each of whom has their own valuation of the sensitive information. By differentiating the responsibilities of different users with respect to the security of sensitive information, we aim to provide useful insights into the study of PPDM.

Gaff, Brian M., Sussman, Heather Egan, Geetter, Jennifer.  2014.  Privacy and Big Data. Computer. 47:7–9.

Big data's explosive growth has prompted the US government to release new reports that address the resulting issues, particularly those related to privacy. The Web extra at http://youtu.be/j49eoe5g8-c is an audio recording from the Computing and the Law column in which authors Brian M. Gaff, Heather Egan Sussman, and Jennifer Geetter discuss these reports.

2015-05-04
Dirik, A.E., Sencar, H.T., Memon, N.  2014.  Analysis of Seam-Carving-Based Anonymization of Images Against PRNU Noise Pattern-Based Source Attribution. IEEE Transactions on Information Forensics and Security. 9:2277–2290.

The availability of sophisticated source attribution techniques raises new concerns about the privacy and anonymity of photographers, activists, and human rights defenders who need to stay anonymous while spreading their images and videos. Recently, the use of seam carving, a content-aware resizing method, has been proposed to anonymize the source camera of images against the well-known photo-response nonuniformity (PRNU)-based source attribution technique. In this paper, we analyze the seam-carving-based source camera anonymization method by introducing two adversarial models and determining the limits of its performance. Our analysis shows that the effectiveness of the de-anonymization attacks depends on various factors, including the parameters of the seam-carving method, the strength of the camera's PRNU noise pattern, and the adversary's ability to identify uncarved image blocks in a seam-carved image. Our results show that, in the general case, there should not be many uncarved blocks larger than 50×50 pixels for successful anonymization of the source camera.
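The PRNU test that such attacks build on reduces, in its simplest form, to correlating an image's noise residual with a camera's fingerprint estimate. A numpy sketch with synthetic signals standing in for real residual extraction (which normally involves a denoising filter):

```python
import numpy as np

def ncc(a, b):
    """Normalized cross-correlation between two flattened signals."""
    a = a.ravel() - a.mean()
    b = b.ravel() - b.mean()
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

rng = np.random.default_rng(0)
fingerprint = rng.normal(size=(64, 64))    # stands in for a camera PRNU estimate

residual_same = 0.1 * fingerprint + rng.normal(size=(64, 64))   # same camera
residual_other = rng.normal(size=(64, 64))                      # different camera

# Source attribution: the correlation is clearly higher for the true camera.
print("same camera:  ", round(ncc(fingerprint, residual_same), 3))
print("other camera: ", round(ncc(fingerprint, residual_other), 3))
```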