Biblio

Filters: Keyword is data anonymization
2021-01-28
Santos, W., Sousa, G., Prata, P., Ferrão, M. E.  2020.  Data Anonymization: K-anonymity Sensitivity Analysis. 2020 15th Iberian Conference on Information Systems and Technologies (CISTI). :1–6.

Digitization is now pervasive, spreading across central governments and local authorities alike. It is hoped that using open government data for scientific research will enhance the public good and social justice. Given the recently adopted European General Data Protection Regulation, the major challenge in Portugal and other European countries is how to strike the right balance between personal data privacy and the value of data for research. This work presents a sensitivity study of a data anonymization procedure applied to real open government data from the Brazilian higher education evaluation system. The ARX k-anonymization algorithm was applied both with and without generalization of several variables of research value. Analysis of the amount of information lost and of the re-identification risk suggests that the anonymization process may under-represent minorities and sociodemographically disadvantaged groups. Such analysis will enable scientists to improve the balance among risk, data usability, and contributions to public-good policies and practices.
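The study applies the ARX k-anonymization tool; as a rough illustration of the underlying k-anonymity notion only (not the authors' procedure or the ARX API), the following Python sketch, with hypothetical column names and a hypothetical k, reports the smallest equivalence class over a set of quasi-identifiers and the share of records that would need suppression to reach k-anonymity.

```python
# Minimal k-anonymity check with pandas (illustrative only; the paper uses the ARX tool).
# Column names and the value of k are hypothetical.
import pandas as pd

def k_anonymity_report(df: pd.DataFrame, quasi_identifiers: list[str], k: int):
    """Return the smallest equivalence-class size and the fraction of records
    that would have to be suppressed for the table to satisfy k-anonymity."""
    class_sizes = df.groupby(quasi_identifiers).size()
    smallest = int(class_sizes.min())
    # Records in equivalence classes smaller than k violate k-anonymity.
    violating = int(class_sizes[class_sizes < k].sum())
    return smallest, violating / len(df)

# Toy data loosely resembling education-survey quasi-identifiers.
df = pd.DataFrame({
    "age_band": ["18-21", "18-21", "22-25", "22-25", "26+"],
    "region":   ["North", "North", "South", "South", "North"],
    "score":    [61, 73, 55, 88, 90],
})
min_class, suppressed_share = k_anonymity_report(df, ["age_band", "region"], k=2)
print(f"smallest class = {min_class}, share needing suppression = {suppressed_share:.0%}")
```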

2020-11-20
Wang, X., Herwono, I., Cerbo, F. D., Kearney, P., Shackleton, M.  2018.  Enabling Cyber Security Data Sharing for Large-scale Enterprises Using Managed Security Services. 2018 IEEE Conference on Communications and Network Security (CNS). :1–7.
Large enterprises and organizations from both the private and public sectors typically outsource a platform solution, as part of Managed Security Services (MSSs), from third-party providers (MSSPs) to monitor and analyze their data containing cyber security information. Sharing such data among these large entities is believed to improve their effectiveness and efficiency in tackling cybercrime through better analytics and insights. However, MSS platform customers are currently unable or unwilling to share data among themselves, for reasons including privacy and confidentiality concerns, even when they use the same MSS platform. Any proposed mechanism or technique to address this challenge therefore needs to ensure that sharing is achieved in a secure and controlled way. In this paper, we propose a new architecture and use-case-driven designs to enable confidential, flexible and collaborative data sharing among organizations using the same MSS platform. The MSS platform is a complex environment in which different stakeholders, including authorized MSSP personnel and customers' own users, have access to the same platform but with different rights and tasks. We therefore make every effort to improve the usability of the sharing-enabled platform while keeping the existing rights and tasks intact. As a pioneering attempt to address the challenge of data sharing on the MSS platform, we hope to encourage further work so that confidential and collaborative sharing eventually happens among MSS platform customers.
2020-01-20
Oqaily, Momen, Jarraya, Yosr, Mohammady, Meisam, Majumdar, Suryadipta, Pourzandi, Makan, Wang, Lingyu, Debbabi, Mourad.  2019.  SegGuard: Segmentation-based Anonymization of Network Data in Clouds for Privacy-Preserving Security Auditing. IEEE Transactions on Dependable and Secure Computing. :1–1.
Security auditing allows cloud tenants to verify the compliance of cloud infrastructure with respect to desirable security properties, e.g., whether a tenant's virtual network is properly isolated from other tenants' networks. However, the input to such an auditing task, such as the detailed topology of the underlying cloud infrastructure, typically contains sensitive information which a cloud provider may be reluctant to hand over to a third party auditor. Additionally, auditing results intended for one tenant may inadvertently reveal private information about other tenants, e.g., another tenant's VM is reachable due to a misconfiguration. How to anonymize both the input data and the auditing results in order to prevent such information leakage is a novel challenge that has received little attention. Directly applying most of the existing anonymization techniques to such a context would either lead to insufficient protection or render the data unsuitable for auditing. In this paper, we propose SegGuard, a novel anonymization approach that prevents cross-tenant information leakage through per-tenant encryption, and prevents information leakage to auditors through hiding real input segments among fake ones; in addition, applying property-preserving encryption in an innovative way enables SegGuard to preserve the data utility for auditing while mitigating semantic attacks. We implement SegGuard based on OpenStack, and evaluate its effectiveness and overhead using both synthetic and real data. Our experimental results demonstrate that SegGuard can reduce the information leakage to a negligible level (e.g., less than 1% for an adversary with 50% pre-knowledge) with a practical response time (e.g., 62 seconds to anonymize a cloud infrastructure with 25,000 virtual machines).
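As a loose illustration of two ideas mentioned in the abstract, hiding real input segments among fake ones and preserving equality so that audit checks still match anonymized identifiers, the Python sketch below uses keyed HMAC pseudonyms plus random decoys; the key handling, segment format, and decoy generation are assumptions for illustration, not SegGuard's actual design.

```python
# Illustrative sketch only: per-tenant keyed pseudonyms that preserve equality,
# with fake segments mixed in so an auditor cannot tell real entries from decoys.
# Not SegGuard's actual algorithm; key management and formats are hypothetical.
import hmac, hashlib, secrets, random

def pseudonym(tenant_key: bytes, value: str) -> str:
    """Deterministic per-tenant pseudonym: equal inputs map to equal outputs,
    so equality-based audit checks still work on anonymized data."""
    return hmac.new(tenant_key, value.encode(), hashlib.sha256).hexdigest()[:16]

def anonymize_segments(tenant_key: bytes, real_segments: list[str], n_fake: int) -> list[str]:
    """Replace real segment identifiers with pseudonyms and hide them among fakes."""
    out = [pseudonym(tenant_key, s) for s in real_segments]
    out += [secrets.token_hex(8) for _ in range(n_fake)]  # decoys in the same format
    random.shuffle(out)
    return out

tenant_key = secrets.token_bytes(32)
published = anonymize_segments(tenant_key, ["vm-101", "subnet-a", "vm-102"], n_fake=5)
# An audit rule about "vm-101" can be rewritten using pseudonym(tenant_key, "vm-101")
# and matched against `published` without revealing the original identifier.
print(pseudonym(tenant_key, "vm-101") in published)  # True
```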
2019-12-16
Sun, Lin, Zhang, Lan, Ye, Xiaojun.  2018.  Randomized Bit Vector: Privacy-Preserving Encoding Mechanism. Proceedings of the 27th ACM International Conference on Information and Knowledge Management. :1263–1272.
Recently, many methods have been proposed to prevent privacy leakage in record linkage by encoding record pair data into another, anonymous space. Nevertheless, they cannot perform well in some circumstances because of high computational complexity, weak privacy guarantees, or loss of data utility. In this paper, we propose distance-aware encoding mechanisms for comparing numerical values in the anonymous space. We first embed numerical values into Hamming space with a low-cost encoding algorithm based on randomized bit vectors. To provide rigorous privacy guarantees, we use randomized response based on differential privacy to ensure global indistinguishability of the original data, and Laplace noise via the pufferfish mechanism to provide local indistinguishability. In addition, we provide an approach for selecting embedding and privacy-related parameters to improve data utility. Experiments on datasets from different data distributions and application contexts show that, compared with previous work, our approaches can be used efficiently in privacy-preserving record linkage tasks and perform well even under very small privacy budgets.
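To convey the general flavor of bit-vector encoding with randomized response (a sketch under assumed parameters, not the paper's exact mechanism), the Python snippet below uses a toy unary (thermometer) encoding, so that Hamming distance mirrors numerical distance, and flips each bit with the standard randomized-response probability.

```python
# Illustrative randomized response over a bit vector (not the paper's exact encoding).
# Each bit is kept with probability e^eps / (e^eps + 1), which gives eps-local
# differential privacy per bit; the encoding of numbers into bits is a toy choice here.
import math, random

def encode_unary(value: int, domain: int) -> list[int]:
    """Toy distance-aware encoding: unary/thermometer code, so the Hamming distance
    between two encodings equals the absolute difference of the values."""
    return [1 if i < value else 0 for i in range(domain)]

def randomized_response(bits: list[int], eps: float) -> list[int]:
    keep_prob = math.exp(eps) / (math.exp(eps) + 1.0)
    return [b if random.random() < keep_prob else 1 - b for b in bits]

a = randomized_response(encode_unary(7, domain=16), eps=2.0)
b = randomized_response(encode_unary(9, domain=16), eps=2.0)
hamming = sum(x != y for x, y in zip(a, b))
print(hamming)  # noisy estimate related to |7 - 9| = 2
```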
2019-11-04
Gunawan, Dedi, Mambo, Masahiro.  2018.  Set-valued Data Anonymization Maintaining Data Utility and Data Property. Proceedings of the 12th International Conference on Ubiquitous Information Management and Communication. :88:1–88:8.

Set-valued database publication has attracted much attention because of its benefits for applications such as recommendation systems and marketing analysis. However, publishing the original database directly is risky, since an unauthorized party may violate individual privacy by associating and analyzing relations between individuals and sets of items in the published database, an attack known as identity linkage. Such an attack is generally based on the attacker's background knowledge obtained through prior investigation, and this adversary knowledge should be taken into account during data anonymization. Various data anonymization schemes have been proposed to prevent identity linkage attacks. In existing schemes, however, either data utility or data properties are greatly reduced by excessive database modification, and consequently data recipients come to distrust the released database. In this paper, we propose a new data anonymization scheme, called sibling suppression, which causes minimal loss of data utility and maintains data properties such as database size and the number of records. The scheme uses multiple sets of adversary knowledge, and items in a category of adversary knowledge are replaced by other items from the same category. Several experiments with a real dataset show that our method preserves data utility with minimal loss and maintains the same data properties as the original database.
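The following Python sketch conveys the general idea of replacing adversary-known items with sibling items from the same category; the categories, records, and replacement policy are made up for illustration and are not the authors' exact sibling-suppression algorithm.

```python
# Illustrative sketch of replacing adversary-known items with sibling items from
# the same category; categories and records are hypothetical, and this is not the
# paper's exact sibling-suppression scheme.
import random

categories = {
    "dairy": ["milk", "cheese", "yogurt", "butter"],
    "fruit": ["apple", "banana", "orange", "grape"],
}
item_to_category = {item: cat for cat, items in categories.items() for item in items}

def sibling_replace(record: list[str], adversary_knowledge: set[str]) -> list[str]:
    """Replace each item the adversary is assumed to know with another item from
    the same category, keeping the record length (and database size) unchanged."""
    out = []
    for item in record:
        if item in adversary_knowledge:
            siblings = [s for s in categories[item_to_category[item]] if s != item]
            out.append(random.choice(siblings))
        else:
            out.append(item)
    return out

record = ["milk", "apple", "cheese"]
print(sibling_replace(record, adversary_knowledge={"milk", "apple"}))
```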