
Found 2348 results

Filters: Keyword is privacy
2022-04-19
Mu, Jing, Jia, Xia.  2021.  Simulation and Analysis of the Influence of Artificial Interference Signal Style on Wireless Security System Performance. 2021 IEEE 4th Advanced Information Management, Communicates, Electronic and Automation Control Conference (IMCEC). 4:2106–2109.
Aiming at the severe security threats faced by information transmission in wireless communication, artificial interference in physical-layer security technology is considered, and the influence of the artificial interference signal style on the security of system information transmission is analyzed by simulation, providing a technical basis for the design of secure wireless transmission systems based on artificial interference.
Evstafyev, G. A., Selyanskaya, E. A..  2021.  Method of Ensuring Structural Secrecy of the Signal. 2021 Systems of Signal Synchronization, Generating and Processing in Telecommunications (SYNCHROINFO). :1–4.
A method for providing energy and structural secrecy of a signal is presented, which is based on the method of pseudo-random restructuring of the spreading sequence. This method complicates the implementation of the accumulation mode, and therefore the detection of the signal-code structure of the signal in a third-party receiver, due to the use of nested pseudo-random sequences (PRS) and their restructuring. And since the receiver-detector is similar to the receiver of the communication system, it is necessary to ensure optimal signal processing to implement an acceptable level of structural secrecy.
2022-04-18
Zhang, Junpeng, Li, Mengqian, Zeng, Shuiguang, Xie, Bin, Zhao, Dongmei.  2021.  A Survey on Security and Privacy Threats to Federated Learning. 2021 International Conference on Networking and Network Applications (NaNA). :319–326.
Federated learning (FL) has emerged as a promising scheme for solving the data silo problem, enabling multiple clients to construct a joint model without centralizing data. The critical concern for flourishing FL applications is building a secure and privacy-preserving learning environment. It is thus highly necessary to comprehensively identify and classify potential threats in order to use FL under security guarantees. This paper starts from the perspective of attacks launched by different computing participants to construct a unique threat classification, highlighting the significant attacks, e.g., poisoning attacks, inference attacks, and generative adversarial network (GAN) attacks. Our study shows that existing FL protocols do not always provide sufficient security and are exposed to various attacks from both clients and servers. GAN attacks pose more significant threats than the other kinds of threats, given the invisibility of the attack process. Moreover, we present a detailed review of several defense mechanisms and approaches for resisting privacy risks and security breaches, and summarize their respective advantages and weaknesses. Finally, we conclude the paper with a discussion of the challenges and some potential research directions.
Miyamae, Takeshi, Kozakura, Fumihiko, Nakamura, Makoto, Zhang, Shenbin, Hua, Song, Pi, Bingfeng, Morinaga, Masanobu.  2021.  ZGridBC: Zero-Knowledge Proof Based Scalable and Private Blockchain Platform for Smart Grid. 2021 IEEE International Conference on Blockchain and Cryptocurrency (ICBC). :1–3.
The total number of photovoltaic power producing facilities whose FIT-based ten-year contract expires by 2023 is expected to reach approximately 1.65 million in Japan. If the number of renewable electricity-producing/consuming facilities reached two million, an enormous number of transactions would be invoked beyond blockchain's scalability. We propose two novel, mutually cooperative methods to simultaneously solve the scalability, data size, and privacy problems in blockchain-based trading platforms for renewable energy environmental value. One is a management scheme of electricity production resources (EPRs) using an extended UTXO token. The other is a data aggregation scheme that aggregates a significant number of smart meter records with evidentiality using zero-knowledge proof (ZKP).
Bonatti, Piero A., Sauro, Luigi, Langens, Jonathan.  2021.  Representing Consent and Policies for Compliance. 2021 IEEE European Symposium on Security and Privacy Workshops (EuroS PW). :283–291.
Being compliant with the GDPR (and data protection regulations in general) is a difficult task that calls for manifold, computer-based automated support. In this context, several use cases related to the management and the enforcement of privacy policies and consent call for a machine-understandable policy language, equipped with reliable algorithms for compliance checking and explanations. In this paper, we outline a set of requirements for such languages and algorithms, and address such requirements with a framework based on a profile of OWL2 and a set of policy serializations based on popular formats such as ODRL and JSON. Such "external" policy syntax is translated into the "internal" OWL2 syntax, thereby enabling semantic compliance checking and explanations using specialized OWL2 reasoners. We provide a precise definition of both the OWL2 profile and the external policy language based on JSON.
2022-04-13
Chen, Hao, Chen, Lin, Kuang, Xiaoyun, Xu, Aidong, Yang, Yiwei.  2021.  Support Forward Secure Smart Grid Data Deduplication and Deletion Mechanism. 2021 2nd Asia Symposium on Signal Processing (ASSP). :67–76.
With the vigorous development of the Internet and the widespread popularity of smart devices, the amount of data they generate has increased exponentially, which has in turn promoted the emergence and development of cloud computing and big data. Given cloud computing and big data technology, cloud storage has become a good solution for storing and managing data. However, as cloud storage manages and regulates massive amounts of data, its security issues have become increasingly prominent. Aiming at the security problems caused by a malicious user's illegal operation of cloud storage and the resulting loss of all data, this paper proposes a threshold signature scheme in which signing uses a private key jointly held by multiple users. When this method performs key operations on cloud storage, multiple people are required to sign, which effectively prevents a small number of malicious users from performing unauthorized data operations. In addition, the threshold signature method in this paper uses a double update factor algorithm. Even if an attacker obtains the key information for the current time period, the attacker cannot calculate the complete key information for earlier or later time periods, which provides two-way security and greatly improves the security of the data in cloud storage.
Chen, Ping-Xiang, Chen, Shuo-Han, Chang, Yuan-Hao, Liang, Yu-Pei, Shih, Wei-Kuan.  2021.  Facilitating the Efficiency of Secure File Data and Metadata Deletion on SMR-based Ext4 File System. 2021 26th Asia and South Pacific Design Automation Conference (ASP-DAC). :728–733.
The efficiency of secure deletion is highly dependent on the data layout of underlying storage devices. In particular, owing to the sequential-write constraint of the emerging Shingled Magnetic Recording (SMR) technology, an improper data layout could lead to serious write amplification and hinder the performance of secure deletion. The performance degradation of secure deletion on SMR drives is further aggravated with the need to securely erase the file system metadata of deleted files due to the small-size nature of file system metadata. Such an observation motivates us to propose a secure-deletion and SMR-aware space allocation (SSSA) strategy to facilitate the process of securely erasing both the deleted files and their metadata simultaneously. The proposed strategy is integrated within the widely-used extended file system 4 (ext4) and is evaluated through a series of experiments to demonstrate the effectiveness of the proposed strategy. The evaluation results show that the proposed strategy can reduce the secure deletion latency by 91.3% on average when compared with naive SMR-based ext4 file system.
Issifu, Abdul Majeed, Ganiz, Murat Can.  2021.  A Simple Data Augmentation Method to Improve the Performance of Named Entity Recognition Models in Medical Domain. 2021 6th International Conference on Computer Science and Engineering (UBMK). :763–768.
Easy Data Augmentation was originally developed for text classification tasks. It consists of four basic methods: Synonym Replacement, Random Insertion, Random Deletion, and Random Swap. They yield accuracy improvements on several deep neural network models. In this study we apply these methods to a new domain: we augment Named Entity Recognition datasets from the medical domain. Although the augmentation task is much more difficult due to the nature of named entities, which consist of words or word groups in the sentences, we show that we can improve named entity recognition performance.
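As a rough illustration of two of the four operations named in the abstract above (Random Deletion and Random Swap), the following Python sketch operates on a list of tokens. It is not the authors' implementation: the function names, the `p` and `n_swaps` parameters, and the choice to pass `(word, tag)` pairs so that entity labels travel with their words are all assumptions made for this example.

```python
import random

def random_deletion(tokens, p, rng):
    """Drop each token independently with probability p (hypothetical helper).

    At least one token is always kept so the sentence never becomes empty.
    """
    if len(tokens) <= 1:
        return list(tokens)
    kept = [t for t in tokens if rng.random() >= p]
    return kept if kept else [rng.choice(tokens)]

def random_swap(tokens, n_swaps, rng):
    """Swap two randomly chosen positions, n_swaps times (hypothetical helper)."""
    out = list(tokens)
    if len(out) < 2:
        return out
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(out)), 2)
        out[i], out[j] = out[j], out[i]
    return out

# Assumed usage for NER: each item is a (word, tag) pair, so a deleted or
# swapped word takes its label with it. The tag names are made up.
sentence = [("aspirin", "B-DRUG"), ("reduces", "O"), ("fever", "B-SYMPTOM")]
rng = random.Random(0)
augmented = random_swap(random_deletion(sentence, 0.1, rng), 1, rng)
```

Synonym Replacement and Random Insertion are left out because they need a synonym source, which the abstract does not specify. Note also that swapping or deleting inside a multi-word entity can break its tag sequence, which is one reason the abstract calls the task harder for NER than for classification.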
Li, Bingzhe, Du, David.  2021.  WAS-Deletion: Workload-Aware Secure Deletion Scheme for Solid-State Drives. 2021 IEEE 39th International Conference on Computer Design (ICCD). :244–247.
Due to the intrinsic properties of Solid-State Drives (SSDs), invalid data remain in SSDs before being erased by a garbage collection process, which increases the risk of being attacked by adversaries. Previous studies use erase and cryptography based schemes to purposely delete target data but face extremely large overhead. In this paper, we propose a Workload-Aware Secure Deletion scheme, called WAS-Deletion, to reduce the overhead of secure deletion through three major components. First, the WAS-Deletion scheme efficiently splits invalid and valid data into different blocks based on workload characteristics. Second, the WAS-Deletion scheme uses a new encryption allocation scheme, making the encryption follow the same direction as the write on multiple blocks and vertically encrypting pages with the same key in one block. Finally, a new adaptive scheduling scheme can dynamically change the configurations of different regions to further reduce secure deletion overhead based on the current workload. The experimental results indicate that the newly proposed WAS-Deletion scheme can reduce the secure deletion cost by about 1.2x to 12.9x compared to previous studies.
Liu, Ling, Zhang, Shengli, Ling, Cong.  2021.  Set Reconciliation for Blockchains with Slepian-Wolf Coding: Deletion Polar Codes. 2021 13th International Conference on Wireless Communications and Signal Processing (WCSP). :1–5.
In this paper, we propose a polar coding based scheme for set reconciliation between two network nodes. The system is modeled as a well-known Slepian-Wolf setting induced by a fixed number of deletions. The set reconciliation process is divided into two phases: 1) a deletion polar code is employed to help one node identify the possible deletion indices, which may be larger in number than the genuine deletions; 2) a lossless compression polar code is then designed to feed back those indices with minimum overhead. Our scheme can be viewed as a generalization of polar codes to some emerging network-based applications such as package synchronization in blockchains. The total overhead is linear in the number of packages, and immune to the package size.
Silva, Wagner, Garcia, Ana Cristina Bicharra.  2021.  Where is our data? A Blockchain-based Information Chain of Custody Model for Privacy Improvement. 2021 IEEE 24th International Conference on Computer Supported Cooperative Work in Design (CSCWD). :329–334.
The advancement of Information and Communication Technologies has brought numerous facilities and benefits to society. In this technology-rich environment, data and personal information have become an essential and coveted resource for many sectors. In this scenario, where a large amount of data is collected, stored, and shared, privacy concerns arise, especially when dealing with sensitive data such as health data. The information owner generally has no control over his or her information, which can have serious consequences, such as increases in health insurance prices or uncomfortable situations caused by disclosure of the individual's physical or mental health. While privacy regulations, like the General Data Protection Regulation (GDPR), make it clear that information owners must have full control and management over their data, disparities have been observed in most systems and platforms. As a result, owners are often unable to give consent or to control and manage their data. For users to exercise their right to privacy and have sufficient control over their data, they must know everything that happens to it, where it is, and where it has been. The entire life cycle, from generation to deletion of data, must be managed by its owner. To this end, this article presents an Information Chain of Custody Model based on Blockchain technology, which provides everything from the traceability of information to tools that enable effective data management, offering total control to the owner. The results showed that the prototype was very useful for tracing the information, demonstrating the technical feasibility of this research.
Sun, He, Liu, Rongke, Tian, Kuangda, Zou, Tong, Feng, Baoping.  2021.  Deletion Error Correction based on Polar Codes in Skyrmion Racetrack Memory. 2021 IEEE Wireless Communications and Networking Conference (WCNC). :1–6.
Skyrmion racetrack memory (Sk-RM) is a new storage technology in which skyrmions are used to represent data bits to provide high storage density. During the reading procedure, the skyrmion is driven by a current and sensed by a fixed read head. However, synchronization errors may happen if the skyrmion does not pass the read head on time. In this paper, a polar coding scheme is proposed to correct the synchronization errors in the Sk-RM. Firstly, we build two error correction models for the reading operation of Sk-RM. By connecting polar codes with the marker codes, the number of deletion errors can be determined. We also redesign the decoding algorithm to recover the information bits from the readout sequence, where a tighter bound of the segmented deletion errors is derived and a novel parity check strategy is designed for better decoding performance. Simulation results show that the proposed coding scheme can efficiently improve the decoding performance.
Ahmad Riduan, Nuraqilah Haidah, Feresa Mohd Foozy, Cik, Hamid, Isredza Rahmi A, Shamala, Palaniappan, Othman, Nur Fadzilah.  2021.  Data Wiping Tool: ByteEditor Technique. 2021 3rd International Cyber Resilience Conference (CRC). :1–6.
This Wiping Tool is an anti-forensic tool built to wipe data permanently from a laptop's storage. The tool is able to prevent the data from being recovered with any recovery tool. The objective of building this wiping tool is to protect the confidentiality and integrity of the data from unauthorized access. People tend to delete files in the normal way; however, such files face the risk of being recovered. Hence, the integrity and confidentiality of a deleted file cannot be protected. With wiping tools, files are overwritten with random strings to make them no longer readable. Thus, the integrity and confidentiality of the file can be protected. Nowadays, many wiping tools face issues such as data breaches because they are unable to delete the data permanently from the devices. This situation might affect their main function and pose a threat to their users. Hence, a new wiping tool is developed to overcome the problem. The new tool, named Data Wiping tool, applies two wiping techniques. The first technique is Randomized Data, while the second is an enhanced wiping technique known as ByteEditor. ByteEditor is a combination of two different techniques, byte editing and byte deletion. The wiping tool is built using an Object-Oriented methodology, which consists of analysis, design, implementation, and testing. The tool is analyzed and compared with other wiping tools before its design starts. Once the design is done, the implementation phase takes place. The code of the tool is written in C# using Visual Studio 2010, and its functionality is tested to ensure that the developed tool meets the objectives of the project. The tool is believed to be able to contribute to the development of wiping tools and to solve problems related to other wiping tools.
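The general idea mentioned in the abstract above, overwriting a file with random data before deleting it, can be sketched in a few lines of Python. This is only an illustration of the generic technique: it is not the ByteEditor method (whose details the abstract does not give), and the function name and the `passes`, `chunk`, and `remove` parameters are assumptions made for this example.

```python
import os

def overwrite_with_random(path, passes=1, chunk=64 * 1024, remove=True):
    """Overwrite a file in place with random bytes, then optionally delete it.

    Hypothetical helper for illustration only. In-place overwriting gives no
    guarantee on SSDs, copy-on-write or journaling file systems, or when
    backups and snapshots exist, because old blocks may survive elsewhere.
    """
    size = os.path.getsize(path)
    with open(path, "r+b") as f:
        for _ in range(passes):
            f.seek(0)
            remaining = size
            while remaining > 0:
                n = min(chunk, remaining)
                f.write(os.urandom(n))   # cryptographically strong random bytes
                remaining -= n
            f.flush()
            os.fsync(f.fileno())         # ask the OS to push the data to the device
    if remove:
        os.remove(path)
```

The file keeps its original length during overwriting, so only the content, not the size or the file name, is obscured by this sketch.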
Godin, Jonathan, Lamontagne, Philippe.  2021.  Deletion-Compliance in the Absence of Privacy. 2021 18th International Conference on Privacy, Security and Trust (PST). :1–10.
Garg, Goldwasser and Vasudevan (Eurocrypt 2020) invented the notion of deletion-compliance to formally model the “right to be forgotten”, a concept that confers on individuals more control over their digital data. A requirement of deletion-compliance is strong privacy for the deletion requesters, since no outside observer must be able to tell if deleted data was ever present in the first place. Naturally, many real-world systems where information can flow across users are automatically ruled out. The main thesis of this paper is that deletion-compliance is a standalone notion, distinct from privacy. We present an alternative definition that meaningfully captures deletion-compliance without any privacy implications. This allows a broader class of data collectors to demonstrate compliance with deletion requests and to be paired with various notions of privacy. Our new definition has several appealing properties: (1) it is implied by the stronger definition of Garg et al. under natural conditions, and is equivalent when we add a strong privacy requirement; (2) it is naturally composable with minimal assumptions; (3) its requirements are met by data structure implementations that do not reveal the order of operations, a concept known as history-independence. Along the way, we discuss the many challenges that remain in providing a universal definition of compliance with the “right to be forgotten.”
Wang, Chengyan, Li, Yuling, Zhang, Yong.  2021.  Hybrid Data Fast Distribution Algorithm for Wireless Sensor Networks in Visual Internet of Things. 2021 International Conference on Big Data Analysis and Computer Science (BDACS). :166–169.
With the maturity of Internet of things technology, massive data transmission has become the focus of research. In order to solve the problem of low speed of traditional hybrid data fast distribution algorithm for wireless sensor networks, a hybrid data fast distribution algorithm for wireless sensor networks based on visual Internet of things is designed. The logic structure of mixed data input gate in wireless sensor network is designed through the visual Internet of things. The objective function of fast distribution of mixed data in wireless sensor network is proposed. The number of copies of data to be distributed is dynamically calculated and the message deletion strategy is determined. Then the distribution parameters are calibrated, and the fitness ranking is performed according to the distribution quantity to complete the algorithm design. The experimental results show that the distribution rate of the designed algorithm is significantly higher than that of the control group, which can solve the problem of low speed of traditional data fast distribution algorithm.
Solanke, Abiodun A., Chen, Xihui, Ramírez-Cruz, Yunior.  2021.  Pattern Recognition and Reconstruction: Detecting Malicious Deletions in Textual Communications. 2021 IEEE International Conference on Big Data (Big Data). :2574–2582.
Digital forensic artifacts aim to provide evidence from digital sources for attributing blame to suspects, assessing their intents, corroborating their statements or alibis, etc. Textual data is a significant source of artifacts, which can take various forms, for instance in the form of communications. E-mails, memos, tweets, and text messages are all examples of textual communications. Complex statistical, linguistic and other scientific procedures can be manually applied to this data to uncover significant clues that point the way to factual information. While expert investigators can undertake this task, there is a possibility that critical information is missed or overlooked. The primary objective of this work is to aid investigators by partially automating the detection of suspicious e-mail deletions. Our approach consists in building a dynamic graph to represent the temporal evolution of communications, and then using a Variational Graph Autoencoder to detect possible e-mail deletions in this graph. Our model uses multiple types of features for representing node and edge attributes, some of which are based on metadata of the messages and the rest are extracted from the contents using natural language processing and text mining techniques. We use the autoencoder to detect missing edges, which we interpret as potential deletions; and to reconstruct their features, from which we emit hypotheses about the topics of deleted messages. We conducted an empirical evaluation of our model on the Enron e-mail dataset, which shows that our model is able to accurately detect a significant proportion of missing communications and to reconstruct the corresponding topic vectors.
2022-04-12
Redini, Nilo, Continella, Andrea, Das, Dipanjan, De Pasquale, Giulio, Spahn, Noah, Machiry, Aravind, Bianchi, Antonio, Kruegel, Christopher, Vigna, Giovanni.  2021.  Diane: Identifying Fuzzing Triggers in Apps to Generate Under-constrained Inputs for IoT Devices. 2021 IEEE Symposium on Security and Privacy (SP). :484–500.
Internet of Things (IoT) devices have rooted themselves in the everyday life of billions of people. Thus, researchers have applied automated bug finding techniques to improve their overall security. However, due to the difficulties in extracting and emulating custom firmware, black-box fuzzing is often the only viable analysis option. Unfortunately, this solution mostly produces invalid inputs, which are quickly discarded by the targeted IoT device and do not penetrate its code. Another proposed approach is to leverage the companion app (i.e., the mobile app typically used to control an IoT device) to generate well-structured fuzzing inputs. Unfortunately, the existing solutions produce fuzzing inputs that are constrained by app-side validation code, thus significantly limiting the range of discovered vulnerabilities. In this paper, we propose a novel approach that overcomes these limitations. Our key observation is that there exist functions inside the companion app that can be used to generate optimal (i.e., valid yet under-constrained) fuzzing inputs. Such functions, which we call fuzzing triggers, are executed before any data-transforming functions (e.g., network serialization), but after the input validation code. Consequently, they generate inputs that are not constrained by app-side sanitization code, and, at the same time, are not discarded by the analyzed IoT device due to their invalid format. We design and develop Diane, a tool that combines static and dynamic analysis to find fuzzing triggers in Android companion apps, and then uses them to fuzz IoT devices automatically. We use Diane to analyze 11 popular IoT devices, and identify 11 bugs, 9 of which are zero days. Our results also show that without using fuzzing triggers, it is not possible to generate bug-triggering inputs for many devices.
Rane, Prachi, Rao, Aishwarya, Verma, Diksha, Mhaisgawali, Amrapali.  2021.  Redacting Sensitive Information from the Data. 2021 International Conference on Smart Generation Computing, Communication and Networking (SMART GENCON). :1–5.
Redaction of personal, confidential and sensitive information from documents is becoming increasingly important for individuals and organizations. In past years, there have been many well-publicized cases of data leaks from various popular companies. When the data contains sensitive information, these leaks pose a serious threat. To protect and conceal sensitive information, many companies have policies and laws about processing and sanitizing sensitive information in business documents. The traditional approach of manually finding and matching millions of words and then redacting is slow and error-prone. This paper examines different models to automate the identification and redaction of personal and sensitive information contained within the documents using named entity recognition. Sensitive entities targeted for redaction, for example a person’s name, bank account details, or Aadhaar numbers, are recognized based on the file’s content, providing users with an interactive approach to redact the documents by changing selected sensitive terms.
Guo, Yifan, Wang, Qianlong, Ji, Tianxi, Wang, Xufei, Li, Pan.  2021.  Resisting Distributed Backdoor Attacks in Federated Learning: A Dynamic Norm Clipping Approach. 2021 IEEE International Conference on Big Data (Big Data). :1172–1182.
With the advance in artificial intelligence and high-dimensional data analysis, federated learning (FL) has emerged to allow distributed data providers to collaboratively learn without direct access to local sensitive data. However, limiting access to individual providers’ data inevitably incurs security issues. For instance, backdoor attacks, one of the most popular data poisoning attacks in FL, severely threaten the integrity and utility of the FL system. In particular, backdoor attacks launched by multiple collusive attackers, i.e., distributed backdoor attacks, can achieve high attack success rates and are hard to detect. Existing defensive approaches, like model inspection or model sanitization, often require access to a portion of the local training data, which renders them inapplicable to FL scenarios. Recently, the norm clipping approach was developed to effectively defend against distributed backdoor attacks in FL without relying on local training data. However, we discover that adversaries can still bypass this defense scheme through robust training due to its unchanged norm clipping threshold. In this paper, we propose a novel defense scheme to resist distributed backdoor attacks in FL. In particular, we first identify that the main reason for the failure of the norm clipping scheme is its fixed threshold in the training process, which cannot capture the dynamic nature of benign local updates during the global model’s convergence. Motivated by this, we devise a novel defense mechanism to dynamically adjust the norm clipping threshold of local updates. Moreover, we provide a convergence analysis of our defense scheme. By evaluating it on four non-IID public datasets, we observe that our defense scheme can effectively resist distributed backdoor attacks and ensure the global model’s convergence. Notably, our scheme reduces the attack success rates by 84.23% on average compared with existing defense schemes.
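To make the norm clipping idea in the abstract above concrete, the Python sketch below rescales each client update so that its L2 norm does not exceed a threshold and then averages the clipped updates. The abstract does not say how the paper adjusts the threshold, so the per-round rule used here (the median of the observed update norms) is purely an assumption for illustration, as are all function names.

```python
import math
import statistics

def l2_norm(update):
    """Euclidean norm of a flat list of parameter deltas."""
    return math.sqrt(sum(x * x for x in update))

def clip_update(update, threshold):
    """Scale the update down so its L2 norm is at most `threshold`."""
    norm = l2_norm(update)
    if norm <= threshold or norm == 0.0:
        return list(update)
    scale = threshold / norm
    return [x * scale for x in update]

def median_norm_threshold(updates):
    """Hypothetical dynamic rule: use this round's median update norm.

    This is an assumed stand-in, not the adjustment rule from the paper.
    """
    return statistics.median(l2_norm(u) for u in updates)

def clipped_mean(updates, threshold):
    """Average the clipped updates coordinate by coordinate."""
    clipped = [clip_update(u, threshold) for u in updates]
    return [sum(col) / len(clipped) for col in zip(*clipped)]

# Assumed usage: three client updates, one of them with an unusually large norm.
round_updates = [[3.0, 4.0], [0.3, 0.4], [30.0, 40.0]]
threshold = median_norm_threshold(round_updates)
global_delta = clipped_mean(round_updates, threshold)
```

With a fixed threshold the first function pair alone is the static defense the abstract criticizes; recomputing the threshold every round is what makes the clipping follow the shrinking norms of benign updates as training converges.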
Kalai Chelvi, T., Ramapraba, P. S., Sathya Priya, M., Vimala, S., Shobarani, R., Jeshwanth, N L, Babisha, A..  2021.  A Web Application for Prevention of Inference Attacks using Crowd Sourcing in Social Networks. 2021 2nd International Conference on Smart Electronics and Communication (ICOSEC). :328–332.
Many people are becoming more reliant on internet social media sites like Facebook. Users can use these networks to share information about themselves and engage with their peers. Some of the data transmitted over these connections is intended to be confidential. However, using publicly available data and learning algorithms, it is feasible to predict concealed information. The proposed research work investigates different ways to launch inference attacks on freely released photo sharing data in order to predict concealed information. Next, this research study offers three distinct sanitization procedures that could be used in a range of scenarios. Moreover, the effectiveness of these strategies, and of attempts to use collective learning to reveal important parts of the data set, is analyzed. The study shows how, by using the sanitization methods presented here, a user may lower the accuracy of both global and interpersonal categorization techniques.
Dutta, Arjun, Chaki, Koustav, Sen, Ayushman, Kumar, Ashutosh, Chakrabarty, Ratna.  2021.  IoT based Sanitization Tunnel. 2021 5th International Conference on Electronics, Materials Engineering Nano-Technology (IEMENTech). :1–5.
The Covid-19 Pandemic has caused huge losses worldwide and is still affecting people all around the world. Even after rigorous, incessant and dedicated efforts from people all around the world, it keeps mutating and spreading at an alarming rate. In times such as these, it is extremely important to take proper precautionary measures to stay safe and help to contain the spread of the virus. In this paper, we propose an innovative design of one such commonly used public disinfection method, an Automatic Walkthrough Sanitization Tunnel. It is a walkthrough sanitization tunnel which uses sensors to detect the target and automatically disinfects it followed by irradiation using UV-C rays for extra protection. There is a proposition to add an IoT based Temperature sensor and data relay module used to detect the temperature of any person entering the tunnel and in case of any anomaly, contact nearby covid wards to facilitate rapid treatment.
Venkatesan, Sridhar, Sikka, Harshvardhan, Izmailov, Rauf, Chadha, Ritu, Oprea, Alina, de Lucia, Michael J..  2021.  Poisoning Attacks and Data Sanitization Mitigations for Machine Learning Models in Network Intrusion Detection Systems. MILCOM 2021 - 2021 IEEE Military Communications Conference (MILCOM). :874–879.
Among many application domains of machine learning in real-world settings, cyber security can benefit from more automated techniques to combat sophisticated adversaries. Modern network intrusion detection systems leverage machine learning models on network logs to proactively detect cyber attacks. However, the risk of adversarial attacks against machine learning used in these cyber settings is not fully explored. In this paper, we investigate poisoning attacks at training time against machine learning models in constrained cyber environments such as network intrusion detection; we also explore mitigations of such attacks based on training data sanitization. We consider the setting of poisoning availability attacks, in which an attacker can insert a set of poisoned samples at training time with the goal of degrading the accuracy of the deployed model. We design a white-box, realizable poisoning attack that reduced the original model accuracy from 95% to less than 50% by generating mislabeled samples in close vicinity of a selected subset of training points. We also propose a novel Nested Training method as a defense against these attacks. Our defense includes a diversified ensemble of classifiers, each trained on a different subset of the training set. We use the disagreement of the classifiers' predictions as a data sanitization method, and show that an ensemble of 10 SVM classifiers is resilient to a large fraction of poisoning samples, up to 30% of the training data.
Chen, Huiping, Dong, Changyu, Fan, Liyue, Loukides, Grigorios, Pissis, Solon P., Stougie, Leen.  2021.  Differentially Private String Sanitization for Frequency-Based Mining Tasks. 2021 IEEE International Conference on Data Mining (ICDM). :41–50.
Strings are used to model genomic, natural language, and web activity data, and are thus often shared broadly. However, string data sharing has raised privacy concerns stemming from the fact that knowledge of length-k substrings of a string and their frequencies (multiplicities) may be sufficient to uniquely reconstruct the string, and from the fact that the inference of such substrings may leak confidential information. We thus introduce the problem of protecting length-k substrings of a single string S by applying Differential Privacy (DP) while maximizing data utility for frequency-based mining tasks. Our theoretical and empirical evidence suggests that classic DP mechanisms are not suitable to address the problem. In response, we employ the order-k de Bruijn graph G of S and propose a sampling-based mechanism for enforcing DP on G. We consider the task of enforcing DP on G using our mechanism while preserving the normalized edge multiplicities in G. We define an optimization problem on integer edge weights that is central to this task and develop an algorithm based on dynamic programming to solve it exactly. We also consider two variants of this problem with real edge weights. By relaxing the constraint of integer edge weights, we are able to develop linear-time exact algorithms for these variants, which we use as stepping stones towards effective heuristics. An extensive experimental evaluation using real-world large-scale strings (on the order of billions of letters) shows that our heuristics are efficient and produce near-optimal solutions which preserve data utility for frequency-based mining tasks.
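The objects the abstract above works with, length-k substrings with their multiplicities and the edges of a de Bruijn graph, can be illustrated with a short Python sketch. It assumes one common convention, in which nodes are length-(k-1) substrings and each length-k substring contributes an edge from its prefix to its suffix; the paper's exact definition of the order-k graph may differ. The sketch shows only the unprotected statistics, not the DP mechanism, and all function names are made up for this example.

```python
from collections import Counter

def kmer_counts(s, k):
    """Multiplicity of every length-k substring of s."""
    return Counter(s[i:i + k] for i in range(len(s) - k + 1))

def de_bruijn_edges(s, k):
    """Edge multiplicities under the assumed convention.

    Each length-k substring w becomes an edge (w[:-1], w[1:]) whose weight
    is the number of times w occurs in s.
    """
    return {(w[:-1], w[1:]): m for w, m in kmer_counts(s, k).items()}

def normalized_multiplicities(edges):
    """Edge weights divided by their total, i.e. relative frequencies."""
    total = sum(edges.values())
    return {e: m / total for e, m in edges.items()}

# Assumed usage on a toy string.
edges = de_bruijn_edges("abab", 2)   # {("a", "b"): 2, ("b", "a"): 1}
freqs = normalized_multiplicities(edges)
```

These raw multiplicities are exactly what the abstract identifies as the privacy risk, so any released version would have to be perturbed first, for instance by the sampling-based mechanism the paper proposes.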
Lavi, Bahram, Nascimento, José, Rocha, Anderson.  2021.  Semi-Supervised Feature Embedding for Data Sanitization in Real-World Events. ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). :2495–2499.
With the rapid growth of data sharing through social media networks, determining which data items are relevant to a particular subject becomes paramount. We address the issue of establishing which images represent an event of interest through a semi-supervised learning technique. The method learns consistent and shared features related to an event (from a small set of examples) and propagates them to an unlabeled set. We investigate the behavior of five image feature representations, considering low- and high-level features and their combinations. We evaluate the effectiveness of the feature embedding approach on five datasets collected from real-world events.
Duth, Akshay, Nambiar, Abhinav A, Teja, Chintha Bhanu, Yadav, Sudha.  2021.  Smart Door System with COVID-19 Risk Factor Evaluation, Contactless Data Acquisition and Sanitization. 2021 International Conference on Artificial Intelligence and Smart Systems (ICAIS). :1504–1511.
Thousands of people have lost their lives to COVID-19 infection. Authorities had seen the calamities caused by the coronavirus in China, so when traces of the virus were found in India, the only possible way to stop its spread was to go into lockdown. In a country like India, where a major part of the population depends on daily wages, the lockdown started affecting people's lives. People tended to go out to get food and other essentials, and this caused the virus to spread. Many were infected and many lost their lives. The pandemic affected the whole world, and many people working in foreign countries lost their jobs as well. Those who came back to India caused further spread of the virus. The main reasons for the spread are a lack of hygiene and the absence of a proper system to monitor symptoms. Even though our country was in lockdown for almost 6 months, the number of COVID cases did not diminish. It is not practical to extend the lockdown any further, and people have decided to live with the virus. However, it is essential to take the necessary precautions while interacting with society. An automated system that checks that all COVID protocols are followed and identifies early symptoms before a person enters a place is essential to stop the spread of the infection. This research work proposes a smart door system that evaluates COVID-19 risk factors and collects a person's data before they enter any place, thereby ensuring that only non-infected people enter and that the spread of the virus can be avoided.