Biblio

Found 111 results

Filters: Keyword is Training data
2023-08-03
Liu, Zhijuan, Zhang, Li, Wu, Xuangou, Zhao, Wei.  2022.  Test Case Filtering based on Generative Adversarial Networks. 2022 IEEE 23rd International Conference on High Performance Switching and Routing (HPSR). :65–69.
Fuzzing is a popular technique for finding software vulnerabilities. Despite their success, state-of-the-art fuzzers will inevitably produce a large number of low-quality inputs. In recent years, Machine Learning (ML) based selection strategies have reported promising results. However, existing ML-based fuzzers are limited by the lack of training data: because the mutation strategy of fuzzing cannot effectively generate useful inputs, it is prohibitively expensive to collect enough inputs to train models. In this paper, we propose a generative adversarial network based solution to generate a large number of inputs and address the problem of insufficient data. We implement the proposal in the American Fuzzy Lop (AFL), and the experimental results show that it finds more crashes in the same amount of time compared with the original AFL.
ISSN: 2325-5609
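As a rough illustration of the idea in the abstract above, the sketch below (PyTorch; the fixed input length, layer sizes, training corpus, and loop counts are illustrative assumptions, not the authors' implementation) trains a small GAN on fixed-length fuzzing inputs and then samples synthetic inputs to enlarge a filter model's training set:

```python
# Minimal GAN sketch for synthesizing fuzzing inputs; sizes are illustrative.
import torch
import torch.nn as nn

INPUT_LEN, NOISE_DIM = 64, 32            # fixed-length byte inputs, latent size

G = nn.Sequential(nn.Linear(NOISE_DIM, 128), nn.ReLU(),
                  nn.Linear(128, INPUT_LEN), nn.Sigmoid())   # bytes scaled to [0, 1]
D = nn.Sequential(nn.Linear(INPUT_LEN, 128), nn.LeakyReLU(0.2),
                  nn.Linear(128, 1))

opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCEWithLogitsLoss()

real = torch.rand(512, INPUT_LEN)        # stand-in for real seed inputs / 255.0

for step in range(1000):
    # discriminator: real seeds vs. generated inputs
    fake = G(torch.randn(64, NOISE_DIM)).detach()
    idx = torch.randint(0, real.size(0), (64,))
    d_loss = bce(D(real[idx]), torch.ones(64, 1)) + bce(D(fake), torch.zeros(64, 1))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # generator: fool the discriminator
    g_loss = bce(D(G(torch.randn(64, NOISE_DIM))), torch.ones(64, 1))
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()

# synthesize new candidate inputs for the filter model's training set
with torch.no_grad():
    new_inputs = (G(torch.randn(256, NOISE_DIM)) * 255).to(torch.uint8)
```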
2023-07-21
Said, Dhaou, Elloumi, Mayssa.  2022.  A New False Data Injection Detection Protocol based Machine Learning for P2P Energy Transaction between CEVs. 2022 IEEE International Conference on Electrical Sciences and Technologies in Maghreb (CISTEM). 4:1–5.
Without security, any network system loses its efficiency, reliability, and resilience. With the huge integration of ICT capabilities, the Electric Vehicle (EV) as a form of transportation in cities is becoming more and more affordable and able to meet citizen and environmental expectations. However, the vulnerability of EVs to cyber-attacks is increasing, which intensifies their negative impact on societies. This paper targets the cybersecurity issues of Connected Electric Vehicles (CEVs) in parking lots where a peer-to-peer (P2P) energy transaction system is deployed. A False Data Injection Attack (FDIA) on the electricity price signal is considered, and a Machine Learning/SVM classification protocol is used to detect the attack and extract the correct values. Simulations are conducted to demonstrate the effectiveness of the proposed model.
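A minimal sketch of the kind of SVM-based FDIA detector the abstract describes, assuming labeled 24-hour price profiles; the synthetic data, feature layout, and injected offsets below are illustrative, not drawn from the paper:

```python
# SVM classifier separating honest from falsified price-signal profiles.
import numpy as np
from sklearn.svm import SVC
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
honest = rng.normal(0.20, 0.02, size=(500, 24))                 # 24-hour price profiles
attacked = honest[:250] + rng.uniform(0.05, 0.15, (250, 24))    # injected offsets
X = np.vstack([honest, attacked])
y = np.hstack([np.zeros(500), np.ones(250)])                    # 1 = falsified signal

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
clf.fit(X_tr, y_tr)
print("detection accuracy:", clf.score(X_te, y_te))
```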
Shiqi, Li, Yinghui, Han.  2022.  Detection of Bad Data and False Data Injection Based on Back-Propagation Neural Network. 2022 IEEE PES Innovative Smart Grid Technologies - Asia (ISGT Asia). :101–105.
Power system state estimation is an essential tool for monitoring the operating conditions of the grid. However, the collected measurements may not always be reliable due to bad data from various faults as well as the increasing potential of being exposed to cyber-attacks, particularly from data injection attacks. To enhance the accuracy of state estimation, this paper presents a back-propagation neural network to detect and identify bad data and false data injections. A variety of training data exhibiting different statistical properties were used for training. The developed strategy was tested on the IEEE 30-bus and 118-bus power systems using MATLAB. Simulation results revealed the feasibility of the method for the detection and differentiation of bad data and false data injections in various operating scenarios.
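The detection-and-differentiation step can be caricatured as a three-class back-propagation network over measurement features; the sketch below (scikit-learn, with synthetic stand-ins for normal, bad, and injected measurements) is an assumption-laden illustration, not the paper's model:

```python
# Three-class back-propagation network: normal / bad data / false data injection.
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(1)
normal = rng.normal(0.0, 0.1, (300, 10))     # well-behaved measurement residuals
bad    = rng.normal(0.0, 1.0, (300, 10))     # noisy faulty measurements
fdia   = rng.normal(0.5, 0.1, (300, 10))     # coordinated stealthy bias
X = np.vstack([normal, bad, fdia])
y = np.repeat([0, 1, 2], 300)

net = MLPClassifier(hidden_layer_sizes=(32, 16), max_iter=500, random_state=1)
net.fit(X, y)
print("training accuracy:", net.score(X, y))
```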
Schulze, Jan-Philipp, Sperl, Philip, Böttinger, Konstantin.  2022.  Anomaly Detection by Recombining Gated Unsupervised Experts. 2022 International Joint Conference on Neural Networks (IJCNN). :1—8.
Anomaly detection has been considered under several extents of prior knowledge. Unsupervised methods do not require any labelled data, whereas semi-supervised methods leverage some known anomalies. Inspired by mixture-of-experts models and the analysis of the hidden activations of neural networks, we introduce a novel data-driven anomaly detection method called ARGUE. Our method is not only applicable to unsupervised and semi-supervised environments, but also profits from the prior knowledge of self-supervised settings. We designed ARGUE as a combination of dedicated expert networks, each of which specialises in a part of the input data. For its final decision, ARGUE fuses the distributed knowledge across the expert systems using a gated mixture-of-experts architecture. Our evaluation suggests that prior knowledge about the normal data distribution may be as valuable as known anomalies.
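The gated fusion step that such mixture-of-experts architectures rely on can be summarized in a few lines: a gating network produces per-expert weights that are softmax-normalized and used to combine per-expert anomaly scores. A hedged numpy sketch with made-up scores:

```python
# Gated mixture-of-experts fusion of per-expert anomaly scores (values invented).
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

expert_scores = np.array([[0.9, 0.2, 0.1],    # per-sample scores from 3 experts
                          [0.1, 0.8, 0.7]])
gate_logits   = np.array([[2.0, 0.1, 0.1],    # gating-network output per sample
                          [0.1, 1.5, 1.5]])

weights = softmax(gate_logits)                # per-expert mixture weights
fused = (weights * expert_scores).sum(axis=1) # final anomaly score per sample
print(fused)
```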
2023-07-14
M, Deepa, Dhiipan, J..  2022.  A Meta-Analysis of Efficient Countermeasures for Data Security. 2022 International Conference on Automation, Computing and Renewable Systems (ICACRS). :1303–1308.
Data security is the process of protecting data from loss, alteration, or unauthorised access during its entire lifecycle. It includes everything from the policies and practices of a company to the hardware, software, storage, and user devices used by that company. Data security tools and technology increase transparency into an organization's data and its usage. These tools can protect data by employing methods such as encryption and data masking of personally identifiable information. Additionally, such methods aid businesses in streamlining their auditing operations and adhering to increasingly strict data protection rules.
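As a tiny, hedged illustration of the data masking mentioned above (the regexes and mask tokens are illustrative choices, not a standard):

```python
# Redact email addresses and digits before data leaves a trusted boundary.
import re

def mask_pii(text: str) -> str:
    text = re.sub(r"[\w.+-]+@[\w-]+\.[\w.]+", "[EMAIL]", text)  # mask emails
    text = re.sub(r"\d", "#", text)                             # mask account/phone digits
    return text

print(mask_pii("Contact jane.doe@example.com, card 4111 1111 1111 1111"))
```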
2023-06-30
Lu, Xiaotian, Piao, Chunhui, Han, Jianghe.  2022.  Differential Privacy High-dimensional Data Publishing Method Based on Bayesian Network. 2022 International Conference on Computer Engineering and Artificial Intelligence (ICCEAI). :623–627.
Ensuring high data availability while realizing privacy protection is a research hotspot in the field of privacy-preserving data publishing. In view of the instability of data availability in the existing differential privacy high-dimensional data publishing methods based on Bayesian networks, this paper proposes an improved MEPrivBayes privacy-preserving data publishing method, with improvements in two respects. First, to address the structural instability caused by random selection of the Bayesian first node, this paper proposes a method of first-node selection and Bayesian network construction based on the Maximum Information Coefficient Matrix. Second, this paper proposes an elastic privacy budget allocation algorithm: on the basis of pre-set differential privacy budget coefficients for all branch and leaf nodes in the Bayesian network, the influence of branch nodes on their child nodes and the average correlation between leaf nodes and all other nodes are calculated to obtain a privacy budget allocation strategy. An SVM multi-classifier is then trained on the privacy-preserving data, with the original data set used as input to evaluate prediction accuracy. The experimental results show that the proposed MEPrivBayes method offers higher data availability than the classical PrivBayes method, especially when the privacy budget is small (i.e., the noise is large), where the availability of the data published by MEPrivBayes decreases less.
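Two ingredients of the abstract, budget allocation across nodes and noise injection, can be sketched with the Laplace mechanism; the weights below stand in for the paper's correlation-based coefficients and are purely illustrative:

```python
# Split a total privacy budget across nodes by weight, then add Laplace noise
# scaled to each node's share (weights are illustrative stand-ins).
import numpy as np

rng = np.random.default_rng(0)
epsilon_total = 1.0
node_weights = np.array([0.4, 0.3, 0.2, 0.1])          # elastic allocation
epsilons = epsilon_total * node_weights / node_weights.sum()

counts = np.array([120., 80., 45., 30.])               # per-node marginal counts
sensitivity = 1.0                                      # one record changes a count by <= 1
noisy = counts + rng.laplace(0.0, sensitivity / epsilons)
print(noisy)
```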
2023-06-29
Jayakody, Nirosh, Mohammad, Azeem, Halgamuge, Malka N..  2022.  Fake News Detection using a Decentralized Deep Learning Model and Federated Learning. IECON 2022 – 48th Annual Conference of the IEEE Industrial Electronics Society. :1–6.

Social media has beneficial and detrimental impacts on social life. The vast distribution of false information on social media has become a worldwide threat. As a result, Fake News Detection in Social Networks has risen in popularity and is now considered an emerging research area. A centralized training technique makes it difficult to build a generalized model that adapts to numerous data sources. In this study, we develop a decentralized Deep Learning model using Federated Learning (FL) for fake news detection. We utilize the ISOT fake news dataset gathered from "Reuters.com" (N = 44,898) to train the deep learning model. The performance of the decentralized and centralized models is then assessed using accuracy, precision, recall, and F1-score measures. In addition, performance was measured while varying the number of FL clients. Our proposed decentralized FL technique achieves high accuracy (99.6%) using fewer communication rounds than previous studies, even without employing pre-trained word embeddings. The strongest results are obtained when our model is compared against three earlier studies. The FL technique may thus be used more efficiently than a centralized method for false news detection, and the use of Blockchain-like technologies can improve the integrity and validity of news sources.

ISSN: 2577-1647
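The decentralized training loop underlying such FL systems reduces to federated averaging; the minimal numpy sketch below (with a one-step logistic-regression update standing in for local training, and invented client data) illustrates only the aggregation rule, not the paper's system:

```python
# One FedAvg round: clients train locally, server averages weighted by data size.
import numpy as np

def local_update(w, X, y, lr=0.1):
    # one gradient step of logistic regression as a stand-in for local training
    p = 1 / (1 + np.exp(-X @ w))
    return w - lr * X.T @ (p - y) / len(y)

rng = np.random.default_rng(0)
w_global = np.zeros(5)
clients = [(rng.normal(size=(n, 5)), rng.integers(0, 2, n)) for n in (50, 80, 120)]

for rnd in range(10):
    local_models = [local_update(w_global, X, y) for X, y in clients]
    sizes = np.array([len(y) for _, y in clients])
    w_global = np.average(local_models, axis=0, weights=sizes)  # FedAvg aggregation
```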

Mahara, Govind Singh, Gangele, Sharad.  2022.  Fake news detection: A RNN-LSTM, Bi-LSTM based deep learning approach. 2022 IEEE 1st International Conference on Data, Decision and Systems (ICDDS). :01–06.

Fake news is a new phenomenon that promotes misleading information and fraud via internet social media or traditional news sources. Fake news is readily manufactured and transmitted across numerous social media platforms nowadays, and it has a significant influence on the real world. It is vital to create effective algorithms and tools for detecting misleading information on social media platforms. Most modern research approaches for identifying fraudulent information are based on machine learning, deep learning, feature engineering, graph mining, image and video analysis, and newly built datasets and online services. There is a pressing need to develop a viable approach for readily detecting misleading information. In this work, deep learning LSTM and Bi-LSTM models are proposed as a method for detecting fake news. First, the NLTK toolkit is used to remove stop words, punctuation, and special characters from the text; the same toolkit is used to tokenize and preprocess the text. GloVe word embeddings then incorporate higher-level characteristics of the input text, extracted from long-term relationships between word sequences captured by the RNN-LSTM and Bi-LSTM models, into the preprocessed text. The proposed model additionally employs dropout with Dense layers to improve efficiency. The proposed RNN Bi-LSTM-based technique obtains its best accuracies of 94% and 93% using the Adam optimizer and the binary cross-entropy loss function with dropout rates of 0.1 and 0.2, respectively; increasing the dropout further reduces accuracy, which falls to 92% at a dropout rate of 0.3.
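A hedged PyTorch sketch of the pipeline described above (an embedding layer where GloVe vectors could be loaded, a bidirectional LSTM, dropout, a dense sigmoid head, Adam, and binary cross-entropy); the vocabulary size, dimensions, and dropout of 0.2 are illustrative, not the paper's configuration:

```python
# Bi-LSTM fake-news classifier sketch: embeddings -> Bi-LSTM -> dropout -> dense head.
import torch
import torch.nn as nn

class BiLSTMFakeNews(nn.Module):
    def __init__(self, vocab=20000, emb=100, hidden=64, dropout=0.2):
        super().__init__()
        self.emb = nn.Embedding(vocab, emb)       # GloVe vectors could be loaded here
        self.lstm = nn.LSTM(emb, hidden, batch_first=True, bidirectional=True)
        self.drop = nn.Dropout(dropout)
        self.head = nn.Linear(2 * hidden, 1)

    def forward(self, tokens):                    # tokens: (batch, seq_len)
        out, _ = self.lstm(self.emb(tokens))
        return self.head(self.drop(out[:, -1]))   # logit for real/fake

model = BiLSTMFakeNews()
opt = torch.optim.Adam(model.parameters())
loss_fn = nn.BCEWithLogitsLoss()                  # binary cross-entropy

x = torch.randint(0, 20000, (8, 50))              # dummy token batch
y = torch.randint(0, 2, (8, 1)).float()
loss = loss_fn(model(x), y)
opt.zero_grad(); loss.backward(); opt.step()
```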

2023-06-22
Seetharaman, Sanjay, Malaviya, Shubham, Vasu, Rosni, Shukla, Manish, Lodha, Sachin.  2022.  Influence Based Defense Against Data Poisoning Attacks in Online Learning. 2022 14th International Conference on COMmunication Systems & NETworkS (COMSNETS). :1–6.
Data poisoning is a type of adversarial attack on training data in which an attacker manipulates a fraction of the data to degrade the performance of a machine learning model. There are several known defensive mechanisms for handling offline attacks; however, defensive measures for online learning, where data points arrive sequentially, have not garnered similar interest. In this work, we propose a defense mechanism to minimize the degradation caused by poisoned training data on a learner's model in an online setup. Our proposed method utilizes the influence function, a classic technique from robust statistics, and further supplements it with existing data sanitization methods for filtering out some of the poisoned data points. We study the effectiveness of our defense mechanism on multiple datasets and across multiple attack strategies against an online learner.
ISSN: 2155-2509
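A heavily simplified stand-in for the influence-based filter: score each incoming point by the norm of its loss gradient at the current model and drop the most influential fraction before the online update. Real influence functions also involve an inverse-Hessian term, which this sketch omits, and all data here is synthetic:

```python
# Online learner that discards the most "influential" incoming points
# (gradient-norm proxy for influence; simplified, not the paper's method).
import numpy as np

def grad_point(w, x, y):
    p = 1 / (1 + np.exp(-x @ w))
    return (p - y) * x                            # logistic-loss gradient for one point

rng = np.random.default_rng(0)
w = np.zeros(5)
for t in range(1000):
    X = rng.normal(size=(32, 5))                  # incoming batch
    y = rng.integers(0, 2, 32)
    scores = np.array([np.linalg.norm(grad_point(w, x, yi)) for x, yi in zip(X, y)])
    keep = scores <= np.quantile(scores, 0.9)     # drop the top 10% most influential
    for x, yi in zip(X[keep], y[keep]):
        w -= 0.05 * grad_point(w, x, yi)          # online SGD update on kept points
```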
2023-05-11
Teo, Jia Wei, Gunawan, Sean, Biswas, Partha P., Mashima, Daisuke.  2022.  Evaluating Synthetic Datasets for Training Machine Learning Models to Detect Malicious Commands. 2022 IEEE International Conference on Communications, Control, and Computing Technologies for Smart Grids (SmartGridComm). :315–321.
Electrical substations in the power grid act as the critical interface points between the transmission and distribution networks. Over the years, digital technology has been integrated into substations for remote control and automation. As a result, substations are more prone to cyber attacks and exposed to digital vulnerabilities. One notable cyber attack vector is malicious command injection, which can lead to the shutdown of substations and subsequent power outages, as demonstrated in the 2015 attack on the Ukrainian power grid. Prevailing measures based on cyber rules (e.g., firewalls and intrusion detection systems) are often inadequate to detect advanced and stealthy attacks that use legitimate-looking measurements or control messages to cause physical damage. Additionally, defenses that use physics-based approaches (e.g., power flow simulation, state estimation, etc.) to detect malicious commands suffer from high latency. Machine learning serves as a potential solution for detecting command injection attacks with high accuracy and low latency. However, sufficient datasets are not readily available to train and evaluate machine learning models. In this paper, focusing on this particular challenge, we discuss various approaches for the generation of synthetic data that can be used to train machine learning models. Further, we evaluate the models trained with synthetic data against attack datasets that simulate malicious command injections with different levels of sophistication. Our findings show that synthetic data generated with some level of power grid domain knowledge helps train robust machine learning models against different types of attacks.
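The train-on-synthetic, test-on-attack protocol can be illustrated as follows; the three command features (setpoint delta, breaker state, time of day) and all distributions are invented for the sketch and do not come from the paper:

```python
# Train a classifier on synthetic command data, evaluate on a held-out attack set.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
benign_syn    = rng.normal([0.1, 0.0, 12.0], [0.05, 0.1, 6.0], (1000, 3))
malicious_syn = rng.normal([0.9, 1.0, 3.0],  [0.2, 0.1, 2.0],  (1000, 3))
X = np.vstack([benign_syn, malicious_syn])
y = np.repeat([0, 1], 1000)                          # 1 = malicious command

clf = RandomForestClassifier(random_state=0).fit(X, y)
attack_test = rng.normal([0.8, 1.0, 3.5], [0.3, 0.1, 2.5], (200, 3))
print("fraction of attack commands flagged:", clf.predict(attack_test).mean())
```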
2023-04-28
Lotfollahi, Mahsa, Tran, Nguyen, Gajjela, Chalapathi, Berisha, Sebastian, Han, Zhu, Mayerich, David, Reddy, Rohith.  2022.  Adaptive Compressive Sampling for Mid-Infrared Spectroscopic Imaging. 2022 IEEE International Conference on Image Processing (ICIP). :2336–2340.
Mid-infrared spectroscopic imaging (MIRSI) is an emerging class of label-free, biochemically quantitative technologies targeting digital histopathology. Conventional histopathology relies on chemical stains that alter tissue color. This approach is qualitative, often making histopathologic examination subjective and difficult to quantify. MIRSI addresses these challenges through quantitative and repeatable imaging that leverages native molecular contrast. Fourier transform infrared (FTIR) imaging, the best-known MIRSI technology, has two challenges that have hindered its widespread adoption: data collection speed and spatial resolution. Recent technological breakthroughs, such as photothermal MIRSI, provide an order of magnitude improvement in spatial resolution. However, this comes at the cost of acquisition speed, which is impractical for clinical tissue samples. This paper introduces an adaptive compressive sampling technique to reduce hyperspectral data acquisition time by an order of magnitude by leveraging spectral and spatial sparsity. This method identifies the most informative spatial and spectral features, integrates a fast tensor completion algorithm to reconstruct megapixel-scale images, and demonstrates speed advantages over FTIR imaging while providing spatial resolutions comparable to new photothermal approaches.
ISSN: 2381-8549
Li, Zongjie, Ma, Pingchuan, Wang, Huaijin, Wang, Shuai, Tang, Qiyi, Nie, Sen, Wu, Shi.  2022.  Unleashing the Power of Compiler Intermediate Representation to Enhance Neural Program Embeddings. 2022 IEEE/ACM 44th International Conference on Software Engineering (ICSE). :2253–2265.
Neural program embeddings have demonstrated considerable promise in a range of program analysis tasks, including clone identification, program repair, code completion, and program synthesis. However, most existing methods generate neural program embeddings directly from the program source code, by learning from features such as tokens, abstract syntax trees, and control flow graphs. This paper takes a fresh look at how to improve program embeddings by leveraging compiler intermediate representation (IR). We first demonstrate simple yet highly effective methods for enhancing embedding quality by training embedding models alongside source code and LLVM IR generated by default optimization levels (e.g., -O2). We then introduce IRGEN, a framework based on genetic algorithms (GA), to identify (near-)optimal sequences of optimization flags that can significantly improve embedding quality. We use IRGEN to find optimal sequences of LLVM optimization flags by performing GA on source code datasets. We then extend a popular code embedding model, CodeCMR, by adding a new objective based on triplet loss to enable joint learning over source code and LLVM IR. We benchmark the quality of embeddings using a representative downstream application, code clone detection. When CodeCMR was trained with source code and LLVM IRs optimized by the findings of IRGEN, the embedding quality was significantly improved, outperforming the state-of-the-art model, CodeBERT, which was trained only with source code. Our augmented CodeCMR also outperformed CodeCMR trained over source code and IR optimized with default optimization levels. We investigate the properties of optimization flags that increase embedding quality, demonstrate IRGEN's generalization in boosting other embedding models, and establish IRGEN's use in settings with extremely limited training data. Our research and findings demonstrate that a straightforward addition to modern neural code embedding models can provide a highly effective enhancement.
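The added triplet-loss objective can be sketched in PyTorch as pulling a function's source-code embedding toward the embedding of its own LLVM IR and away from an unrelated function's IR; the linear encoders below are stand-ins for CodeCMR's real networks, and all dimensions are assumptions:

```python
# Triplet-loss joint learning over paired source-code and IR features (sketch).
import torch
import torch.nn as nn

embed_src = nn.Linear(512, 128)     # stand-in for the source-code encoder
embed_ir  = nn.Linear(512, 128)     # stand-in for the LLVM-IR encoder
triplet = nn.TripletMarginLoss(margin=1.0)

src    = torch.randn(32, 512)       # source-code features (anchor)
ir_pos = torch.randn(32, 512)       # IR of the same function (positive)
ir_neg = torch.randn(32, 512)       # IR of a different function (negative)

loss = triplet(embed_src(src), embed_ir(ir_pos), embed_ir(ir_neg))
loss.backward()                     # gradients flow into both encoders
```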
2023-03-31
Kahla, Mostafa, Chen, Si, Just, Hoang Anh, Jia, Ruoxi.  2022.  Label-Only Model Inversion Attacks via Boundary Repulsion. 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). :15025–15033.
Recent studies show that the state-of-the-art deep neural networks are vulnerable to model inversion attacks, in which access to a model is abused to reconstruct private training data of any given target class. Existing attacks rely on having access to either the complete target model (whitebox) or the model's soft-labels (blackbox). However, no prior work has been done in the harder but more practical scenario, in which the attacker only has access to the model's predicted label, without a confidence measure. In this paper, we introduce an algorithm, Boundary-Repelling Model Inversion (BREP-MI), to invert private training data using only the target model's predicted labels. The key idea of our algorithm is to evaluate the model's predicted labels over a sphere and then estimate the direction to reach the target class's centroid. Using the example of face recognition, we show that the images reconstructed by BREP-MI successfully reproduce the semantics of the private training data for various datasets and target model architectures. We compare BREP-MI with the state-of-the-art white-box and blackbox model inversion attacks, and the results show that despite assuming less knowledge about the target model, BREP-MI outperforms the blackbox attack and achieves comparable results to the whitebox attack. Our code is available online at https://github.com/m-kahla/Label-Only-Model-Inversion-Attacks-via-Boundary-Repulsion.
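A numpy caricature of the label-only idea (not the authors' algorithm): query points on a sphere around the current estimate and step toward the average of the directions the victim model still labels as the target class. `query_label` is a hypothetical oracle standing in for the attacked model:

```python
# Label-only sphere-query search toward a target class region (caricature).
import numpy as np

def query_label(x):                               # placeholder for the victim model
    return int(np.linalg.norm(x - 0.4) < 2.0)     # class 1 = "target" region

rng = np.random.default_rng(0)
z = np.zeros(16)                                  # current reconstruction estimate
for step in range(200):
    dirs = rng.normal(size=(64, 16))
    dirs /= np.linalg.norm(dirs, axis=1, keepdims=True)   # points on a sphere
    hits = np.array([query_label(z + 0.5 * d) for d in dirs])
    if hits.any():
        z += 0.1 * dirs[hits == 1].mean(axis=0)   # step toward the target class
```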
Zhou, Linjun, Cui, Peng, Zhang, Xingxuan, Jiang, Yinan, Yang, Shiqiang.  2022.  Adversarial Eigen Attack on BlackBox Models. 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). :15233–15241.
Black-box adversarial attacks have attracted much research attention due to the difficulty they pose: nearly no information about the attacked model is available, and there is an additional constraint on the query budget. A common way to improve attack efficiency is to transfer the gradient information of a white-box substitute model trained on an extra dataset. In this paper, we deal with a more practical setting in which a pre-trained white-box model with its network parameters is provided without extra training data. To solve the model mismatch problem between the white-box and black-box models, we propose a novel algorithm, EigenBA, which systematically integrates the gradient-based white-box method with the zeroth-order optimization used in black-box methods. We theoretically show that the optimal perturbation directions for each step are closely related to the right singular vectors of the Jacobian matrix of the pre-trained white-box model. Extensive experiments on ImageNet, CIFAR-10 and WebVision show that EigenBA can consistently and significantly outperform state-of-the-art baselines in terms of success rate and attack efficiency.
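The theoretical observation can be demonstrated in a few lines of numpy: take the SVD of the substitute's Jacobian and use a top right singular vector as the perturbation direction. `J` below is a random stand-in for a real Jacobian, so this only illustrates the linear-algebra step, not the full attack:

```python
# Perturbation direction from the top right singular vector of a Jacobian.
import numpy as np

rng = np.random.default_rng(0)
J = rng.normal(size=(128, 3072))            # d(features)/d(input pixels), stand-in
_, _, Vt = np.linalg.svd(J, full_matrices=False)
perturb_dir = Vt[0]                         # top right singular vector
x_adv_step = 0.01 * perturb_dir / np.linalg.norm(perturb_dir)
```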
Zhang, Jie, Li, Bo, Xu, Jianghe, Wu, Shuang, Ding, Shouhong, Zhang, Lei, Wu, Chao.  2022.  Towards Efficient Data Free Blackbox Adversarial Attack. 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). :15094–15104.
Classic black-box adversarial attacks can take advantage of transferable adversarial examples generated by a similar substitute model to successfully fool the target model. However, these substitute models need to be trained by target models' training data, which is hard to acquire due to privacy or transmission reasons. Recognizing the limited availability of real data for adversarial queries, recent works proposed to train substitute models in a data-free black-box scenario. However, their generative adversarial networks (GANs) based framework suffers from the convergence failure and the model collapse, resulting in low efficiency. In this paper, by rethinking the collaborative relationship between the generator and the substitute model, we design a novel black-box attack framework. The proposed method can efficiently imitate the target model through a small number of queries and achieve high attack success rate. The comprehensive experiments over six datasets demonstrate the effectiveness of our method against the state-of-the-art attacks. Especially, we conduct both label-only and probability-only attacks on the Microsoft Azure online model, and achieve a 100% attack success rate with only 0.46% query budget of the SOTA method [49].
2023-03-06
Le, Trung-Nghia, Akihiro, Sugimoto, Ono, Shintaro, Kawasaki, Hiroshi.  2020.  Toward Interactive Self-Annotation For Video Object Bounding Box: Recurrent Self-Learning And Hierarchical Annotation Based Framework. 2020 IEEE Winter Conference on Applications of Computer Vision (WACV). :3220–3229.
The amount and variety of training data drastically affect the performance of CNNs. Thus, annotation methods are becoming increasingly critical for collecting data efficiently. In this paper, we propose a simple yet efficient Interactive Self-Annotation framework to cut down both the time and the human labor cost of video object bounding box annotation. Our method is based on recurrent self-supervised learning and consists of two processes: an automatic process and an interactive process, where the automatic process aims to build a supporting detector to speed up the interactive process. In the Automatic Recurrent Annotation, we let an off-the-shelf detector watch unlabeled videos repeatedly to reinforce itself automatically. At each iteration, we utilize the trained model from the previous iteration to generate better pseudo ground-truth bounding boxes than those of the previous iteration, recurrently improving the self-supervised training of the detector. In the Interactive Recurrent Annotation, we tackle the human-in-the-loop annotation scenario where the detector receives feedback from the human annotator. To this end, we propose a novel Hierarchical Correction module, in which the annotated frame-distance decreases in a binary fashion at each time step, to utilize the strength of the CNN for neighboring frames. Experimental results on various video datasets demonstrate the advantages of the proposed framework in generating high-quality annotations while reducing annotation time and human labor costs.
ISSN: 2642-9381
2023-02-03
Nelson, Jared Ray, Shekaramiz, Mohammad.  2022.  Authorship Verification via Linear Correlation Methods of n-gram and Syntax Metrics. 2022 Intermountain Engineering, Technology and Computing (IETC). :1–6.
This research evaluates the accuracy of two methods of authorship prediction: syntactical analysis and n-gram, and explores its potential usage. The proposed algorithm measures n-gram, and counts adjectives, adverbs, verbs, nouns, punctuation, and sentence length from the training data, and normalizes each metric. The proposed algorithm compares the metrics of training samples to testing samples and predicts authorship based on the correlation they share for each metric. The severity of correlation between the testing and training data produces significant weight in the decision-making process. For example, if analysis of one metric approximates 100% positive correlation, the weight in the decision is assigned a maximum value for that metric. Conversely, a 100% negative correlation receives the minimum value. This new method of authorship validation holds promise for future innovation in fraud protection, the study of historical documents, and maintaining integrity within academia.
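A hedged sketch of the correlation-weighted decision: correlate a candidate author's normalized metric profile against the test document's profile and pick the highest-scoring author. All metric values and author names are invented for illustration:

```python
# Pearson-correlation authorship scoring over normalized stylometric metrics.
import numpy as np

# per-author profiles: adjectives, adverbs, verbs, nouns, punctuation, sentence length
candidates = {
    "author_A": np.array([0.08, 0.05, 0.17, 0.24, 0.11, 0.62]),
    "author_B": np.array([0.03, 0.12, 0.10, 0.30, 0.20, 0.40]),
}
test_profile = np.array([0.07, 0.06, 0.18, 0.22, 0.10, 0.60])

def score(profile, test):
    return np.corrcoef(profile, test)[0, 1]   # correlation acts as decision weight

best = max(candidates, key=lambda a: score(candidates[a], test_profile))
print("predicted author:", best)
```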
Chakraborty, Joymallya, Majumder, Suvodeep, Tu, Huy.  2022.  Fair-SSL: Building fair ML Software with less data. 2022 IEEE/ACM International Workshop on Equitable Data & Technology (FairWare). :1–8.
Ethical bias in machine learning models has become a matter of concern in the software engineering community. Most prior software engineering works concentrated on finding ethical bias in models rather than fixing it. After finding bias, the next step is mitigation. Prior researchers mainly tried to use supervised approaches to achieve fairness. However, in the real world, getting data with trustworthy ground truth is challenging, and ground truth can itself contain human bias. Semi-supervised learning is a technique where, incrementally, labeled data is used to generate pseudo-labels for the rest of the data (and then all of that data is used for model training). In this work, we apply four popular semi-supervised techniques as pseudo-labelers to create fair classification models. Our framework, Fair-SSL, takes a very small amount (10%) of labeled data as input and generates pseudo-labels for the unlabeled data. We then synthetically generate new data points to balance the training data based on class and protected attribute, as proposed by Chakraborty et al. in FSE 2021. Finally, a classification model is trained on the balanced pseudo-labeled data and validated on test data. After experimenting on ten datasets and three learners, we find that Fair-SSL achieves performance similar to three state-of-the-art bias mitigation algorithms. That said, the clear advantage of Fair-SSL is that it requires only 10% of the labeled training data. To the best of our knowledge, this is the first SE work where semi-supervised techniques are used to fight ethical bias in SE ML models. To facilitate open science and replication, all our source code and datasets are publicly available at https://github.com/joymallyac/FairSSL.
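The semi-supervised core that Fair-SSL builds on can be sketched with scikit-learn's self-training wrapper; the class-and-protected-attribute balancing step from the paper is omitted here, and the data is synthetic:

```python
# Pseudo-labeling from ~10% labeled data via self-training (sketch).
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.semi_supervised import SelfTrainingClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 8))
y_true = (X[:, 0] + X[:, 1] > 0).astype(int)

y = np.full(1000, -1)                                  # -1 marks unlabeled samples
labeled = rng.choice(1000, size=100, replace=False)    # the 10% labeled slice
y[labeled] = y_true[labeled]

pseudo = SelfTrainingClassifier(LogisticRegression()).fit(X, y)
print("pseudo-label accuracy:", (pseudo.predict(X) == y_true).mean())
```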
Muliono, Yohan, Darus, Mohamad Yusof, Pardomuan, Chrisando Ryan, Ariffin, Muhammad Azizi Mohd, Kurniawan, Aditya.  2022.  Predicting Confidentiality, Integrity, and Availability from SQL Injection Payload. 2022 International Conference on Information Management and Technology (ICIMTech). :600–605.
SQL injection has been a harmful and prolific threat to web applications for more than 20 years, yet it still poses a huge threat to the World Wide Web. Rapidly evolving web technology has not eradicated this threat; in 2017, 51% of web application attacks were SQL injection attacks. Most conventional practices to prevent SQL injection attacks revolve around secure web and database programming and administration techniques. Owing to developer ignorance, a large number of online applications remain susceptible to SQL injection attacks. There is a need for a more effective method to detect and prevent SQL injection attacks. In this research, we offer a unique machine learning-based strategy for identifying potential SQL injection attack threats, and we discuss the application of the proposed method in a Security Information and Event Management (SIEM) system. A SIEM can aggregate and normalize event information from multiple sources and detect malicious events through analysis of this information. The results of this work show that a machine learning-based SQL injection attack detector using the SIEM approach possesses high accuracy in detecting malicious SQL queries.
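A hedged sketch of such a detector: character n-gram TF-IDF features over raw query strings feeding a linear classifier. The four toy payloads are illustrative, not a real training corpus:

```python
# Character n-gram TF-IDF + logistic regression for SQLi payload detection (sketch).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

queries = ["SELECT name FROM users WHERE id = 7",
           "SELECT * FROM t WHERE a='x' OR '1'='1'",
           "UPDATE items SET qty = 3 WHERE sku = 'A1'",
           "1; DROP TABLE users--"]
labels = [0, 1, 0, 1]                                 # 1 = injection payload

clf = make_pipeline(TfidfVectorizer(analyzer="char_wb", ngram_range=(2, 4)),
                    LogisticRegression())
clf.fit(queries, labels)
print(clf.predict(["' OR 1=1 --"]))                   # expect [1]
```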
2023-01-20
Leak, Matthew Haslett, Venayagamoorthy, Ganesh Kumar.  2022.  Situational Awareness of De-energized Lines During Loss of SCADA Communication in Electric Power Distribution Systems. 2022 IEEE/PES Transmission and Distribution Conference and Exposition (T&D). :1–5.

With the electric power distribution grid facing ever increasing complexity and new threats from cyber-attacks, situational awareness for system operators is quickly becoming indispensable. Identifying de-energized lines on the distribution system during a SCADA communication failure is a prime example where operators need to act quickly to deal with an emergent loss of service. Loss of cellular towers, poor signal strength, and even cyber-attacks can impact SCADA visibility of line devices on the distribution system. Neural Networks (NNs) provide a unique approach to learn the characteristics of normal system behavior, identify when abnormal conditions occur, and flag these conditions for system operators. This study applies a 24-hour load forecast for distribution line devices given the weather forecast and day of the week, then determines the current state of distribution devices based on changes in SCADA analogs from communicating line devices. A neural network-based algorithm is applied to historical events on Alabama Power's distribution system to identify de-energized sections of line when a significant amount of SCADA information is hidden.
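The forecasting piece can be caricatured as a regressor from weather and day-of-week features to hourly load, with de-energized line sections flagged when measured SCADA analogs fall far below the forecast; everything below (features, thresholds, data) is synthetic and illustrative:

```python
# Load forecast from weather / day-of-week, then flag suspiciously low analogs.
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
X = np.column_stack([rng.uniform(-5, 35, 500),        # forecast temperature
                     rng.integers(0, 7, 500)])        # day of week
y = 100 + 2.0 * np.abs(X[:, 0] - 18) + 5 * (X[:, 1] >= 5) + rng.normal(0, 3, 500)

model = MLPRegressor(hidden_layer_sizes=(32,), max_iter=2000, random_state=0).fit(X, y)
expected = model.predict([[22.0, 2]])[0]              # forecast for one device-hour
measured = 12.0                                       # near-zero analog from device
if measured < 0.5 * expected:                         # threshold is an assumption
    print("possible de-energized line section")
```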

Omeroglu, Asli Nur, Mohammed, Hussein M. A., Oral, E. Argun, Yucel Ozbek, I..  2022.  Detection of Moving Target Direction for Ground Surveillance Radar Based on Deep Learning. 2022 30th Signal Processing and Communications Applications Conference (SIU). :1–4.
In defense and security applications, detecting a moving target's direction is as important as target detection and/or target classification. In this study, a methodology for detecting whether different mobile targets are approaching or receding was proposed for ground surveillance radar data, and convolutional neural networks (CNNs) based on transfer learning were employed for this purpose. In order to improve the classification performance, the use of two key concepts, namely the Deep Convolutional Generative Adversarial Network (DCGAN) and decision fusion, has been proposed. With DCGAN, the limited data available for training was augmented, creating a bigger training dataset with a distribution identical to the original data for both moving directions. This generated synthetic data was then used along with the original training data to train three different pre-trained deep convolutional networks. Finally, the classification results obtained from these networks were combined with a decision fusion approach. In order to evaluate the performance of the proposed method, the publicly available RadEch dataset consisting of eight ground target classes was utilized. Based on the experimental results, it was observed that the combined use of the proposed DCGAN and decision fusion methods increased the moving-target detection accuracy for the person, vehicle, and group-of-persons classes, and for all target groups, by 13.63%, 10.01%, 14.82% and 8.62%, respectively.
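The decision-fusion step reduces to averaging the class-probability outputs of the three fine-tuned networks and taking the argmax; a hedged numpy sketch with placeholder softmax outputs:

```python
# Soft decision fusion of three CNN outputs (probabilities are placeholders).
import numpy as np

p_net1 = np.array([0.7, 0.3])   # P(approaching), P(receding) from network 1
p_net2 = np.array([0.6, 0.4])
p_net3 = np.array([0.4, 0.6])

fused = np.mean([p_net1, p_net2, p_net3], axis=0)
print("direction:", ["approaching", "receding"][int(fused.argmax())])
```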
2023-01-06
Chen, Tianlong, Zhang, Zhenyu, Zhang, Yihua, Chang, Shiyu, Liu, Sijia, Wang, Zhangyang.  2022.  Quarantine: Sparsity Can Uncover the Trojan Attack Trigger for Free. 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). :588–599.
Trojan attacks threaten deep neural networks (DNNs) by poisoning them to behave normally on most samples, yet to produce manipulated results for inputs attached with a particular trigger. Several works attempt to detect whether a given DNN has been injected with a specific trigger during the training. In a parallel line of research, the lottery ticket hypothesis reveals the existence of sparse sub-networks which are capable of reaching competitive performance as the dense network after independent training. Connecting these two dots, we investigate the problem of Trojan DNN detection from the brand new lens of sparsity, even when no clean training data is available. Our crucial observation is that the Trojan features are significantly more stable to network pruning than benign features. Leveraging that, we propose a novel Trojan network detection regime: first locating a “winning Trojan lottery ticket” which preserves nearly full Trojan information yet only chance-level performance on clean inputs; then recovering the trigger embedded in this already isolated sub-network. Extensive experiments on various datasets, i.e., CIFAR-10, CIFAR-100, and ImageNet, with different network architectures, i.e., VGG-16, ResNet-18, ResNet-20s, and DenseNet-100 demonstrate the effectiveness of our proposal. Codes are available at https://github.com/VITA-Group/Backdoor-LTH.
Erbil, Pinar, Gursoy, M. Emre.  2022.  Detection and Mitigation of Targeted Data Poisoning Attacks in Federated Learning. 2022 IEEE Intl Conf on Dependable, Autonomic and Secure Computing, Intl Conf on Pervasive Intelligence and Computing, Intl Conf on Cloud and Big Data Computing, Intl Conf on Cyber Science and Technology Congress (DASC/PiCom/CBDCom/CyberSciTech). :1–8.
Federated learning (FL) has emerged as a promising paradigm for distributed training of machine learning models. In FL, several participants train a global model collaboratively by only sharing model parameter updates while keeping their training data local. However, FL was recently shown to be vulnerable to data poisoning attacks, in which malicious participants send parameter updates derived from poisoned training data. In this paper, we focus on defending against targeted data poisoning attacks, where the attacker’s goal is to make the model misbehave for a small subset of classes while the rest of the model is relatively unaffected. To defend against such attacks, we first propose a method called MAPPS for separating malicious updates from benign ones. Using MAPPS, we propose three methods for attack detection: MAPPS + X-Means, MAPPS + VAT, and their Ensemble. Then, we propose an attack mitigation approach in which a "clean" model (i.e., a model that is not negatively impacted by an attack) can be trained despite the existence of a poisoning attempt. We empirically evaluate all of our methods using popular image classification datasets. Results show that we can achieve >95% true positive rates while incurring only <2% false positive rate. Furthermore, the clean models that are trained using our proposed methods have accuracy comparable to models trained in an attack-free scenario.
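A simplified stand-in for the update-separation idea (not MAPPS itself): cluster flattened client updates and treat the small, distant cluster as suspect. The real method pairs this with X-Means/VAT and per-class probing, so the sketch below only illustrates the clustering intuition on synthetic updates:

```python
# Cluster flattened FL client updates; flag the minority cluster as suspect.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
benign   = rng.normal(0.0, 0.1, (18, 50))        # 18 honest client updates
poisoned = rng.normal(1.5, 0.1, (2, 50))         # 2 targeted-poisoning updates
updates = np.vstack([benign, poisoned])

km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(updates)
minority = np.argmin(np.bincount(km.labels_))    # smaller cluster = suspect
print("suspect clients:", np.where(km.labels_ == minority)[0])
```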
Fan, Jiaxin, Yan, Qi, Li, Mohan, Qu, Guanqun, Xiao, Yang.  2022.  A Survey on Data Poisoning Attacks and Defenses. 2022 7th IEEE International Conference on Data Science in Cyberspace (DSC). :48–55.
With the widespread deployment of data-driven services, the demand for data volumes continues to grow. At present, many applications lack reliable human supervision in the process of data collection, which means the collected data may contain low-quality or even malicious data. Such low-quality or malicious data makes AI systems face many potential security challenges. One of the main security threats in the training phase of machine learning is the data poisoning attack, which compromises model integrity by contaminating training data to make the resulting model skewed or unusable. This paper reviews the relevant research on data poisoning attacks in various task environments: first, a classification of attacks is summarized; then, defense methods against data poisoning attacks are surveyed; and finally, possible future research directions are discussed.
Alotaibi, Jamal, Alazzawi, Lubna.  2022.  PPIoV: A Privacy Preserving-Based Framework for IoV-Fog Environment Using Federated Learning and Blockchain. 2022 IEEE World AI IoT Congress (AIIoT). :597–603.
The integration of the Internet-of-Vehicles (IoV) and fog computing benefits from cooperative computing and analysis of environmental data while avoiding network congestion and latency. However, when private data is shared across fog nodes or the cloud, there exist privacy issues that limit the effectiveness of IoV systems, putting drivers' safety at risk. To address this problem, we propose a framework called PPIoV, which is based on Federated Learning (FL) and Blockchain technologies to preserve the privacy of vehicles in IoV. Typical machine learning methods are not well suited for distributed and highly dynamic systems like IoV since they train on data with local features. Therefore, we use FL to train the global model while preserving privacy. Also, our approach is built on a scheme that evaluates the reliability of vehicles participating in the FL training process. Moreover, PPIoV is built on blockchain to establish trust across multiple communication nodes. For example, when the local learned model updates from the vehicles and fog nodes are communicated with the cloud to update the global learned model, all transactions take place on the blockchain. The outcome of our experimental study shows that the proposed method improves the global model's accuracy as a result of allowing reputed vehicles to update the global model.
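A minimal sketch of the ledger idea: hash-chain the model-update transactions so that any later tampering is detectable. All fields and digests below are illustrative, and a real blockchain would add consensus and signatures on top of this:

```python
# Hash-chained log of model-update transactions (tamper-evidence sketch).
import hashlib, json, time

def add_block(chain, update_digest):
    prev = chain[-1]["hash"] if chain else "0" * 64
    block = {"prev": prev, "update": update_digest, "ts": time.time()}
    block["hash"] = hashlib.sha256(json.dumps(block, sort_keys=True).encode()).hexdigest()
    chain.append(block)

chain = []
add_block(chain, hashlib.sha256(b"client-7 weights v1").hexdigest())
add_block(chain, hashlib.sha256(b"fog-2 aggregate v1").hexdigest())

# verify the chain links: each block must reference the previous block's hash
print(all(chain[i]["prev"] == chain[i - 1]["hash"] for i in range(1, len(chain))))
```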