Redini, Nilo, Continella, Andrea, Das, Dipanjan, De Pasquale, Giulio, Spahn, Noah, Machiry, Aravind, Bianchi, Antonio, Kruegel, Christopher, Vigna, Giovanni.
2021.
Diane: Identifying Fuzzing Triggers in Apps to Generate Under-constrained Inputs for IoT Devices. 2021 IEEE Symposium on Security and Privacy (SP). :484—500.
Internet of Things (IoT) devices have rooted themselves in the everyday life of billions of people. Thus, researchers have applied automated bug finding techniques to improve their overall security. However, due to the difficulties in extracting and emulating custom firmware, black-box fuzzing is often the only viable analysis option. Unfortunately, this solution mostly produces invalid inputs, which are quickly discarded by the targeted IoT device and do not penetrate its code. Another proposed approach is to leverage the companion app (i.e., the mobile app typically used to control an IoT device) to generate well-structured fuzzing inputs. Unfortunately, the existing solutions produce fuzzing inputs that are constrained by app-side validation code, thus significantly limiting the range of discovered vulnerabilities.In this paper, we propose a novel approach that overcomes these limitations. Our key observation is that there exist functions inside the companion app that can be used to generate optimal (i.e., valid yet under-constrained) fuzzing inputs. Such functions, which we call fuzzing triggers, are executed before any data-transforming functions (e.g., network serialization), but after the input validation code. Consequently, they generate inputs that are not constrained by app-side sanitization code, and, at the same time, are not discarded by the analyzed IoT device due to their invalid format. We design and develop Diane, a tool that combines static and dynamic analysis to find fuzzing triggers in Android companion apps, and then uses them to fuzz IoT devices automatically. We use Diane to analyze 11 popular IoT devices, and identify 11 bugs, 9 of which are zero days. Our results also show that without using fuzzing triggers, it is not possible to generate bug-triggering inputs for many devices.
Rane, Prachi, Rao, Aishwarya, Verma, Diksha, Mhaisgawali, Amrapali.
2021.
Redacting Sensitive Information from the Data. 2021 International Conference on Smart Generation Computing, Communication and Networking (SMART GENCON). :1—5.
Redaction of personal, confidential and sensitive information from documents is becoming increasingly important for individuals and organizations. In past years, there have been many well-publicized cases of data leaks from various popular companies. When the data contains sensitive information, these leaks pose a serious threat. To protect and conceal sensitive information, many companies have policies and laws about processing and sanitizing sensitive information in business documents.The traditional approach of manually finding and matching millions of words and then redacting is slow and error-prone. This paper examines different models to automate the identification and redaction of personal and sensitive information contained within the documents using named entity recognition. Sensitive entities example person’s name, bank account details or Aadhaar numbers targeted for redaction, are recognized based on the file’s content, providing users with an interactive approach to redact the documents by changing selected sensitive terms.
Guo, Yifan, Wang, Qianlong, Ji, Tianxi, Wang, Xufei, Li, Pan.
2021.
Resisting Distributed Backdoor Attacks in Federated Learning: A Dynamic Norm Clipping Approach. 2021 IEEE International Conference on Big Data (Big Data). :1172—1182.
With the advance in artificial intelligence and high-dimensional data analysis, federated learning (FL) has emerged to allow distributed data providers to collaboratively learn without direct access to local sensitive data. However, limiting access to individual provider’s data inevitably incurs security issues. For instance, backdoor attacks, one of the most popular data poisoning attacks in FL, severely threaten the integrity and utility of the FL system. In particular, backdoor attacks launched by multiple collusive attackers, i.e., distributed backdoor attacks, can achieve high attack success rates and are hard to detect. Existing defensive approaches, like model inspection or model sanitization, often require to access a portion of local training data, which renders them inapplicable to the FL scenarios. Recently, the norm clipping approach is developed to effectively defend against distributed backdoor attacks in FL, which does not rely on local training data. However, we discover that adversaries can still bypass this defense scheme through robust training due to its unchanged norm clipping threshold. In this paper, we propose a novel defense scheme to resist distributed backdoor attacks in FL. Particularly, we first identify that the main reason for the failure of the norm clipping scheme is its fixed threshold in the training process, which cannot capture the dynamic nature of benign local updates during the global model’s convergence. Motivated by it, we devise a novel defense mechanism to dynamically adjust the norm clipping threshold of local updates. Moreover, we provide the convergence analysis of our defense scheme. By evaluating it on four non-IID public datasets, we observe that our defense scheme effectively can resist distributed backdoor attacks and ensure the global model’s convergence. Noticeably, our scheme reduces the attack success rates by 84.23% on average compared with existing defense schemes.
Kalai Chelvi, T., Ramapraba, P. S., Sathya Priya, M., Vimala, S., Shobarani, R., Jeshwanth, N L, Babisha, A..
2021.
A Web Application for Prevention of Inference Attacks using Crowd Sourcing in Social Networks. 2021 2nd International Conference on Smart Electronics and Communication (ICOSEC). :328—332.
Many people are becoming more reliant on internet social media sites like Facebook. Users can utilize these networks to reveal articles to them and engage with your peers. Several of the data transmitted from these connections is intended to be confidential. However, utilizing publicly available data and learning algorithms, it is feasible to forecast concealed informative data. The proposed research work investigates the different ways to initiate deduction attempts on freely released photo sharing data in order to envisage concealed informative data. Next, this research study offers three distinct sanitization procedures that could be used in a range of scenarios. Moreover, the effectualness of all these strategies and endeavor to utilize collective teaching and research to reveal important bits of the data set are analyzed. It shows how, by using the sanitization methods presented here, a user may lower the accuracy by including both global and interpersonal categorization techniques.
Dutta, Arjun, Chaki, Koustav, Sen, Ayushman, Kumar, Ashutosh, Chakrabarty, Ratna.
2021.
IoT based Sanitization Tunnel. 2021 5th International Conference on Electronics, Materials Engineering Nano-Technology (IEMENTech). :1—5.
The Covid-19 Pandemic has caused huge losses worldwide and is still affecting people all around the world. Even after rigorous, incessant and dedicated efforts from people all around the world, it keeps mutating and spreading at an alarming rate. In times such as these, it is extremely important to take proper precautionary measures to stay safe and help to contain the spread of the virus. In this paper, we propose an innovative design of one such commonly used public disinfection method, an Automatic Walkthrough Sanitization Tunnel. It is a walkthrough sanitization tunnel which uses sensors to detect the target and automatically disinfects it followed by irradiation using UV-C rays for extra protection. There is a proposition to add an IoT based Temperature sensor and data relay module used to detect the temperature of any person entering the tunnel and in case of any anomaly, contact nearby covid wards to facilitate rapid treatment.
Venkatesan, Sridhar, Sikka, Harshvardhan, Izmailov, Rauf, Chadha, Ritu, Oprea, Alina, de Lucia, Michael J..
2021.
Poisoning Attacks and Data Sanitization Mitigations for Machine Learning Models in Network Intrusion Detection Systems. MILCOM 2021 - 2021 IEEE Military Communications Conference (MILCOM). :874—879.
Among many application domains of machine learning in real-world settings, cyber security can benefit from more automated techniques to combat sophisticated adversaries. Modern network intrusion detection systems leverage machine learning models on network logs to proactively detect cyber attacks. However, the risk of adversarial attacks against machine learning used in these cyber settings is not fully explored. In this paper, we investigate poisoning attacks at training time against machine learning models in constrained cyber environments such as network intrusion detection; we also explore mitigations of such attacks based on training data sanitization. We consider the setting of poisoning availability attacks, in which an attacker can insert a set of poisoned samples at training time with the goal of degrading the accuracy of the deployed model. We design a white-box, realizable poisoning attack that reduced the original model accuracy from 95% to less than 50 % by generating mislabeled samples in close vicinity of a selected subset of training points. We also propose a novel Nested Training method as a defense against these attacks. Our defense includes a diversified ensemble of classifiers, each trained on a different subset of the training set. We use the disagreement of the classifiers' predictions as a data sanitization method, and show that an ensemble of 10 SVM classifiers is resilient to a large fraction of poisoning samples, up to 30% of the training data.
Chen, Huiping, Dong, Changyu, Fan, Liyue, Loukides, Grigorios, Pissis, Solon P., Stougie, Leen.
2021.
Differentially Private String Sanitization for Frequency-Based Mining Tasks. 2021 IEEE International Conference on Data Mining (ICDM). :41—50.
Strings are used to model genomic, natural language, and web activity data, and are thus often shared broadly. However, string data sharing has raised privacy concerns stemming from the fact that knowledge of length-k substrings of a string and their frequencies (multiplicities) may be sufficient to uniquely reconstruct the string; and from that the inference of such substrings may leak confidential information. We thus introduce the problem of protecting length-k substrings of a single string S by applying Differential Privacy (DP) while maximizing data utility for frequency-based mining tasks. Our theoretical and empirical evidence suggests that classic DP mechanisms are not suitable to address the problem. In response, we employ the order-k de Bruijn graph G of S and propose a sampling-based mechanism for enforcing DP on G. We consider the task of enforcing DP on G using our mechanism while preserving the normalized edge multiplicities in G. We define an optimization problem on integer edge weights that is central to this task and develop an algorithm based on dynamic programming to solve it exactly. We also consider two variants of this problem with real edge weights. By relaxing the constraint of integer edge weights, we are able to develop linear-time exact algorithms for these variants, which we use as stepping stones towards effective heuristics. An extensive experimental evaluation using real-world large-scale strings (in the order of billions of letters) shows that our heuristics are efficient and produce near-optimal solutions which preserve data utility for frequency-based mining tasks.
Lavi, Bahram, Nascimento, José, Rocha, Anderson.
2021.
Semi-Supervised Feature Embedding for Data Sanitization in Real-World Events. ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). :2495—2499.
With the rapid growth of data sharing through social media networks, determining relevant data items concerning a particular subject becomes paramount. We address the issue of establishing which images represent an event of interest through a semi-supervised learning technique. The method learns consistent and shared features related to an event (from a small set of examples) to propagate them to an unlabeled set. We investigate the behavior of five image feature representations considering low- and high-level features and their combinations. We evaluate the effectiveness of the feature embedding approach on five collected datasets from real-world events.
Duth, Akshay, Nambiar, Abhinav A, Teja, Chintha Bhanu, Yadav, Sudha.
2021.
Smart Door System with COVID-19 Risk Factor Evaluation, Contactless Data Acquisition and Sanitization. 2021 International Conference on Artificial Intelligence and Smart Systems (ICAIS). :1504—1511.
Thousands of people have lost their life by COVID-19 infection. Authorities have seen the calamities caused by the corona virus in China. So, when the trace of virus was found in India, the only possible way to stop the spread of the virus was to go into lockdown. In a country like India where a major part of the population depends on the daily wages, being in lockdown started affecting their life. People where tend to go out for getting the food items and other essentials, and this caused the spread of virus. Many were infected and many lost their life by this. Due to the pandemic, the whole world was affected and many people working in foreign countries lost their jobs as well. These people who came back to India caused further spread of the virus. The main reason for the spread is lack of hygiene and a proper system to monitor the symptoms. Even though our country was in lockdown for almost 6 months the number of COVID cases doesn't get diminished. It is not practical to extend the lockdown any further, and people have decided to live with the virus. But it is essential to take the necessary precautions while interacting with the society. Automated system for checking that all the COVID protocols are followed and early symptom identification before entering to a place are essential to stop the spread of the infection. This research work proposes a smart door system, which evaluates the COVID-19 risk factors and collects the data of person before entering into any place, thereby ensuring that non-infected people are only entering to the place and thus the spread of virus can be avoided.
Yucel, Cagatay, Chalkias, Ioannis, Mallis, Dimitrios, Cetinkaya, Deniz, Henriksen-Bulmer, Jane, Cooper, Alice.
2021.
Data Sanitisation and Redaction for Cyber Threat Intelligence Sharing Platforms. 2021 IEEE International Conference on Cyber Security and Resilience (CSR). :343—347.
The recent technological advances and changes in the daily human activities increased the production and sharing of data. In the ecosystem of interconnected systems, data can be circulated among systems for various reasons. This could lead to exchange of private or sensitive information between entities. Data Sanitisation involves processes and practices that remove sensitive and private information from documents before sharing them with entities that should not have access to this information. This paper presents the design and development of a data sanitisation and redaction solution for a Cyber Threat Intelligence sharing platform. The Data Sanitisation and Redaction Plugin has been designed with the purpose of operating as a plugin for the ECHO Project’s Early Warning System platform and enhancing its operative capabilities during information sharing. This plugin aims to provide automated security and privacy-based controls to the concept of CTI sharing over a ticketing system. The plugin has been successfully tested and the results are presented in this paper.