Visible to the public Biblio

Filters: Keyword is anonymized data  [Clear All Filters]
2021-01-28
Zhang, M., Wei, T., Li, Z., Zhou, Z..  2020.  A service-oriented adaptive anonymity algorithm. 2020 39th Chinese Control Conference (CCC). :7626—7631.

Recently, a large amount of research studies aiming at the privacy-preserving data publishing have been conducted. We find that most K-anonymity algorithms fail to consider the characteristics of attribute values distribution in data and the contribution value differences in quasi-identifier attributes when service-oriented. In this paper, the importance of distribution characteristics of attribute values and the differences in contribution value of quasi-identifier attributes to anonymous results are illustrated. In order to maximize the utility of released data, a service-oriented adaptive anonymity algorithm is proposed. We establish a model of reaction dispersion degree to quantify the characteristics of attribute value distribution and introduce the concept of utility weight related to the contribution value of quasi-identifier attributes. The priority coefficient and the characterization coefficient of partition quality are defined to optimize selection strategies of dimension and splitting value in anonymity group partition process adaptively, which can reduce unnecessary information loss so as to further improve the utility of anonymized data. The rationality and validity of the algorithm are verified by theoretical analysis and multiple experiments.

2020-08-13
Augusto, Cristian, Morán, Jesús, De La Riva, Claudio, Tuya, Javier.  2019.  Test-Driven Anonymization for Artificial Intelligence. 2019 IEEE International Conference On Artificial Intelligence Testing (AITest). :103—110.
In recent years, data published and shared with third parties to develop artificial intelligence (AI) tools and services has significantly increased. When there are regulatory or internal requirements regarding privacy of data, anonymization techniques are used to maintain privacy by transforming the data. The side-effect is that the anonymization may lead to useless data to train and test the AI because it is highly dependent on the quality of the data. To overcome this problem, we propose a test-driven anonymization approach for artificial intelligence tools. The approach tests different anonymization efforts to achieve a trade-off in terms of privacy (non-functional quality) and functional suitability of the artificial intelligence technique (functional quality). The approach has been validated by means of two real-life datasets in the domains of healthcare and health insurance. Each of these datasets is anonymized with several privacy protections and then used to train classification AIs. The results show how we can anonymize the data to achieve an adequate functional suitability in the AI context while maintaining the privacy of the anonymized data as high as possible.
2020-04-20
Lim, Yeon-sup, Srivatsa, Mudhakar, Chakraborty, Supriyo, Taylor, Ian.  2018.  Learning Light-Weight Edge-Deployable Privacy Models. 2018 IEEE International Conference on Big Data (Big Data). :1290–1295.
Privacy becomes one of the important issues in data-driven applications. The advent of non-PC devices such as Internet-of-Things (IoT) devices for data-driven applications leads to needs for light-weight data anonymization. In this paper, we develop an anonymization framework that expedites model learning in parallel and generates deployable models for devices with low computing capability. We evaluate our framework with various settings such as different data schema and characteristics. Our results exhibit that our framework learns anonymization models up to 16 times faster than a sequential anonymization approach and that it preserves enough information in anonymized data for data-driven applications.