Biblio

Filters: Keyword is federated learning
2023-08-24
Wei-Kocsis, Jin, Sabounchi, Moein, Yang, Baijian, Zhang, Tonglin.  2022.  Cybersecurity Education in the Age of Artificial Intelligence: A Novel Proactive and Collaborative Learning Paradigm. 2022 IEEE Frontiers in Education Conference (FIE). :1–5.
This Innovative Practice Work-in-Progress paper presents a virtual, proactive, and collaborative learning paradigm that can engage learners with different backgrounds and enable effective retention and transfer of multidisciplinary AI-cybersecurity knowledge. While progress has been made to better understand the trustworthiness and security of artificial intelligence (AI) techniques, little has been done to translate this knowledge to education and training. There is a critical need to foster a qualified cybersecurity workforce that understands the usefulness, limitations, and best practices of AI technologies in the cybersecurity domain. To address this important issue, in our proposed learning paradigm, we leverage multidisciplinary expertise in cybersecurity, AI, and statistics to systematically investigate two cohesive research and education goals. First, we develop an immersive learning environment that motivates students to explore AI/machine learning (ML) development in the context of real-world cybersecurity scenarios by constructing learning models with tangible objects. Second, we design a proactive education paradigm that uses hackathon activities based on game-based learning, lifelong learning, and social constructivism. The proposed paradigm will benefit a wide range of learners, especially underrepresented students. It will also help the general public understand the security implications of AI. In this paper, we describe our proposed learning paradigm and present our current progress on this ongoing research work. At the current stage, we focus on the first research and education goal and have been leveraging the cost-effective Minecraft platform to develop an immersive learning environment in which learners can investigate emerging AI/ML concepts by constructing related learning modules and interacting with tangible AI/ML building blocks.
ISSN: 2377-634X
2023-07-21
Wenqi, Huang, Lingyu, Liang, Xin, Wang, Zhengguo, Ren, Shang, Cao, Xiaotao, Jiang.  2022.  An Early Warning Analysis Model of Metering Equipment Based on Federated Hybrid Expert System. 2022 15th International Symposium on Computational Intelligence and Design (ISCID). :217–220.
The smooth operation of metering equipment is inseparable from the monitoring and analysis of equipment alarm events by automated metering systems. With the generation of big data in power metering and the increasing demand for information security in the power industry's metering systems, how to use big data while protecting data security has become a hot research area. In this paper, we propose a hybrid expert model based on federated learning to deal with the problem of alarm information analysis and identification. The hybrid expert system divides the metering early-warning problem into multiple sub-problems for processing, which greatly improves recognition and prediction accuracy. Experimental results show that our model achieves high accuracy in identifying equipment faults.
2023-06-29
Jayakody, Nirosh, Mohammad, Azeem, Halgamuge, Malka N..  2022.  Fake News Detection using a Decentralized Deep Learning Model and Federated Learning. IECON 2022 – 48th Annual Conference of the IEEE Industrial Electronics Society. :1–6.

Social media has beneficial and detrimental impacts on social life. The vast distribution of false information on social media has become a worldwide threat. As a result, fake news detection in social networks has risen in popularity and is now considered an emerging research area. A centralized training technique makes it difficult to build a generalized model that adapts to numerous data sources. In this study, we develop a decentralized Deep Learning model using Federated Learning (FL) for fake news detection. We utilize the ISOT fake news dataset gathered from "Reuters.com" (N = 44,898) to train the deep learning model. The performance of the decentralized and centralized models is then assessed using accuracy, precision, recall, and F1-score measures. In addition, performance was measured while varying the number of FL clients. Our proposed decentralized FL technique achieves high accuracy (99.6%) using fewer communication rounds than previous studies, even without employing pre-trained word embeddings, and it outperforms the three earlier studies we compare against. The FL technique can thus be used more efficiently than a centralized method for fake news detection. The use of Blockchain-like technologies can improve the integrity and validity of news sources.

ISSN: 2577-1647
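For readers new to the setup, the decentralized training above reduces to the standard FedAvg step: clients train locally and a server averages their parameters, weighted by local dataset size. A minimal NumPy sketch follows; the layer name, client count, and sample counts are illustrative assumptions, not this paper's configuration.

```python
import numpy as np

def fed_avg(client_weights, client_sizes):
    """Weighted average of per-client parameters (the FedAvg aggregation step).

    client_weights: list of dicts mapping layer name -> np.ndarray
    client_sizes:   number of local training samples per client
    """
    total = sum(client_sizes)
    return {
        layer: sum((n / total) * w[layer]
                   for w, n in zip(client_weights, client_sizes))
        for layer in client_weights[0]
    }

# Toy example: three clients with unbalanced local datasets.
clients = [{"dense": np.random.randn(4, 2)} for _ in range(3)]
global_model = fed_avg(clients, client_sizes=[1000, 500, 250])
print(global_model["dense"].shape)  # (4, 2)
```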

2023-03-17
Pham, Hong Thai, Nguyen, Khanh Nam, Phun, Vy Hoa, Dang, Tran Khanh.  2022.  Secure Recommender System based on Neural Collaborative Filtering and Federated Learning. 2022 International Conference on Advanced Computing and Analytics (ACOMPA). :1–11.
A recommender system aims to suggest the most relevant items to users based on their personal data. However, data privacy is a growing concern for everyone. Secure recommender systems are a research direction that preserves user privacy while maintaining performance as high as possible. The most recent strategy is to use Federated Learning, a machine learning technique for privacy-preserving distributed training. In Federated Learning, a subset of users is selected to train the model using data on their local systems, the server securely aggregates the computing results from the local models to generate a global model, and finally that model gives recommendations to users. In this paper, we present a novel algorithm to train a Collaborative Filtering recommender system specialized for the ranking task in the Federated Learning setting, where the goal is to protect user interaction information (i.e., implicit feedback). Specifically, with the help of the algorithm, the recommender system is trained by Neural Collaborative Filtering, one of the state-of-the-art matrix factorization methods, and Bayesian Personalized Ranking, the most common pairwise approach. In contrast to existing approaches, which protect user privacy by requiring users to download/upload the information associated with all items that they could possibly interact with in order to perform training, our algorithm protects user privacy at low communication cost: users only need to obtain/transfer the information related to a small number of interactions per training iteration. Above all, through extensive experiments, the algorithm is demonstrated to utilize user data more efficiently than the most recent related work, FedeRank, while ensuring that user privacy is still preserved.
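As background for the ranking objective named above, here is a minimal NumPy sketch of one Bayesian Personalized Ranking (BPR) SGD step for plain matrix factorization; the paper's federated, NCF-based variant additionally keeps interaction data on-device and transfers only a handful of item updates per iteration. The factor dimension, learning rate, and regularization below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n_users, n_items, k = 100, 50, 8
U = rng.normal(scale=0.1, size=(n_users, k))  # user latent factors
V = rng.normal(scale=0.1, size=(n_items, k))  # item latent factors

def bpr_step(u, i, j, lr=0.05, reg=0.01):
    """One BPR step for user u: rank observed item i above unobserved item j."""
    u_f = U[u].copy()
    x_uij = u_f @ (V[i] - V[j])       # pairwise preference score
    g = 1.0 / (1.0 + np.exp(x_uij))   # gradient factor, sigmoid(-x_uij)
    U[u] += lr * (g * (V[i] - V[j]) - reg * U[u])
    V[i] += lr * (g * u_f - reg * V[i])
    V[j] -= lr * (g * u_f + reg * V[j])

bpr_step(u=3, i=10, j=42)             # one (user, positive, negative) triple
```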
2023-02-02
Wang, Zirui, Duan, Shaoming, Wu, Chengyue, Lin, Wenhao, Zha, Xinyu, Han, Peiyi, Liu, Chuanyi.  2022.  Generative Data Augmentation for Non-IID Problem in Decentralized Clinical Machine Learning. 2022 4th International Conference on Data Intelligence and Security (ICDIS). :336–343.
Swarm learning (SL) is an emerging and promising decentralized machine learning paradigm that has achieved high performance in clinical applications. SL addresses the central-server problem of federated learning by combining edge computing with a blockchain-based peer-to-peer network. While there are promising results under the assumption of independent and identically distributed (IID) data across participants, SL suffers from performance degradation as the degree of non-IID data increases. To address this problem, we propose a generative augmentation framework for swarm learning, called SL-GAN, which augments non-IID data by generating synthetic data from participants. SL-GAN trains generators and discriminators locally and periodically aggregates them via a randomly elected coordinator in the SL network. Under standard assumptions, we theoretically prove the convergence of SL-GAN using stochastic approximation. Experimental results demonstrate that SL-GAN outperforms state-of-the-art methods on three real-world clinical datasets: Tuberculosis, Leukemia, and COVID-19.
2023-01-20
Nightingale, James S., Wang, Yingjie, Zobiri, Fairouz, Mustafa, Mustafa A..  2022.  Effect of Clustering in Federated Learning on Non-IID Electricity Consumption Prediction. 2022 IEEE PES Innovative Smart Grid Technologies Conference Europe (ISGT-Europe). :1–5.

When applied to short-term energy consumption forecasting, the federated learning framework allows for the creation of a predictive model without sharing raw data. There is a limit to the accuracy achieved by standard federated learning due to the heterogeneity of the individual clients' data, especially in the case of electricity data, where prediction of peak demand is a challenge. A set of clustering techniques has been explored in the literature to improve prediction quality while maintaining user privacy. These studies have mainly been conducted using sets of clients with similar attributes that may not reflect real-world consumer diversity. This paper explores, implements and compares these clustering techniques for privacy-preserving load forecasting on a representative electricity consumption dataset. The experimental results demonstrate the effects of electricity consumption heterogeneity on federated forecasting and a non-representative sample's impact on load forecasting.
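A common pattern behind the clustering techniques compared here is to group clients by the similarity of their flattened local model updates and then aggregate within each cluster instead of globally. The sketch below illustrates that pattern with scikit-learn's KMeans; the two simulated consumption profiles and the update dimensionality are assumptions for illustration, not this paper's data.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(1)
# Flattened local model updates from 20 simulated households,
# drawn from two distinct consumption regimes.
updates = np.vstack([
    rng.normal(loc=0.0, scale=0.1, size=(10, 32)),
    rng.normal(loc=0.5, scale=0.1, size=(10, 32)),
])

labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(updates)
# Aggregate (average) per cluster rather than over all clients.
cluster_models = {c: updates[labels == c].mean(axis=0) for c in set(labels)}
print({c: m.shape for c, m in cluster_models.items()})
```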

2023-01-06
Siriwardhana, Yushan, Porambage, Pawani, Liyanage, Madhusanka, Ylianttila, Mika.  2022.  Robust and Resilient Federated Learning for Securing Future Networks. 2022 Joint European Conference on Networks and Communications & 6G Summit (EuCNC/6G Summit). :351–356.
Machine Learning (ML) and Artificial Intelligence (AI) techniques are widely adopted in the telecommunication industry, especially to automate beyond-5G networks. Federated Learning (FL) recently emerged as a distributed ML approach that enables localized model training and keeps data decentralized to ensure data privacy. In this paper, we identify the applicability of FL for securing future networks and its limitations due to its vulnerability to poisoning attacks. First, we investigate the shortcomings of state-of-the-art security algorithms for FL and perform an attack that circumvents the FoolsGold algorithm, which is known as one of the most promising defense techniques currently available. The attack is launched by adding intelligent noise to the poisoned model updates. Then we propose a more sophisticated defense strategy: a threshold-based clustering mechanism that complements FoolsGold. Moreover, we provide a comprehensive analysis of the impact of the attack scenario and the performance of the defense mechanism.
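For context on what the attack circumvents: FoolsGold's core heuristic down-weights clients whose update histories are suspiciously similar to each other, since independent honest clients rarely agree that closely. A condensed sketch of that scoring follows (the published algorithm adds pardoning and logit rescaling, and this paper's threshold-based clustering extension is not modeled); the simulated sybil data are illustrative.

```python
import numpy as np

def foolsgold_weights(grad_history):
    """Down-weight clients with highly similar accumulated update histories.

    grad_history: (n_clients, d) array of accumulated client gradients.
    Returns per-client aggregation weights in [0, 1].
    """
    g = grad_history / (np.linalg.norm(grad_history, axis=1, keepdims=True) + 1e-12)
    cs = g @ g.T                   # pairwise cosine similarity
    np.fill_diagonal(cs, -1.0)     # ignore self-similarity
    max_sim = cs.max(axis=1)       # similarity to the closest other client
    w = 1.0 - np.clip(max_sim, 0.0, 1.0)
    return w / (w.max() + 1e-12)   # honest clients keep weight near 1

# Three colluding sybils push near-identical updates; two honest clients do not.
rng = np.random.default_rng(2)
sybil = rng.normal(size=64)
hist = np.vstack([sybil + rng.normal(scale=0.01, size=64) for _ in range(3)]
                 + [rng.normal(size=64) for _ in range(2)])
print(foolsgold_weights(hist).round(2))  # sybil weights collapse toward 0
```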
Erbil, Pinar, Gursoy, M. Emre.  2022.  Detection and Mitigation of Targeted Data Poisoning Attacks in Federated Learning. 2022 IEEE Intl Conf on Dependable, Autonomic and Secure Computing, Intl Conf on Pervasive Intelligence and Computing, Intl Conf on Cloud and Big Data Computing, Intl Conf on Cyber Science and Technology Congress (DASC/PiCom/CBDCom/CyberSciTech). :1–8.
Federated learning (FL) has emerged as a promising paradigm for distributed training of machine learning models. In FL, several participants train a global model collaboratively by only sharing model parameter updates while keeping their training data local. However, FL was recently shown to be vulnerable to data poisoning attacks, in which malicious participants send parameter updates derived from poisoned training data. In this paper, we focus on defending against targeted data poisoning attacks, where the attacker’s goal is to make the model misbehave for a small subset of classes while the rest of the model is relatively unaffected. To defend against such attacks, we first propose a method called MAPPS for separating malicious updates from benign ones. Using MAPPS, we propose three methods for attack detection: MAPPS + X-Means, MAPPS + VAT, and their Ensemble. Then, we propose an attack mitigation approach in which a "clean" model (i.e., a model that is not negatively impacted by an attack) can be trained despite the existence of a poisoning attempt. We empirically evaluate all of our methods using popular image classification datasets. Results show that we can achieve >95% true positive rates while incurring only a <2% false positive rate. Furthermore, the clean models that are trained using our proposed methods have accuracy comparable to models trained in an attack-free scenario.
Anastasakis, Zacharias, Psychogyios, Konstantinos, Velivassaki, Terpsi, Bourou, Stavroula, Voulkidis, Artemis, Skias, Dimitrios, Gonos, Antonis, Zahariadis, Theodore.  2022.  Enhancing Cyber Security in IoT Systems using FL-based IDS with Differential Privacy. 2022 Global Information Infrastructure and Networking Symposium (GIIS). :30–34.
Nowadays, IoT networks and devices exist in our everyday life, capturing and carrying unlimited data. However, the increasing penetration of connected systems and devices implies rising cybersecurity threats, with IoT systems suffering from network attacks. Artificial Intelligence (AI) and Machine Learning take advantage of huge volumes of IoT network logs to enhance cybersecurity in IoT. However, these data are often desired to remain private. Federated Learning (FL) provides a potential solution: it enables collaborative training of an attack detection model among a set of federated nodes, while preserving privacy, as data remain local and are never disclosed or processed on central servers. While FL is resilient and resolves, up to a point, data governance and ownership issues, it does not guarantee security and privacy by design. Adversaries could interfere with the communication process, expose network vulnerabilities, and manipulate the training process, thus affecting the performance of the trained model. In this paper, we present a federated learning model which can successfully detect network attacks in IoT systems. Moreover, we evaluate its performance under various settings of differential privacy as a privacy-preserving technique and various configurations of the participating nodes. We show that the proposed model protects privacy without significantly compromising performance: it incurs a limited performance impact of only ∼7% lower testing accuracy compared to the baseline, while simultaneously guaranteeing security and applicability.
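As a rough illustration of the privacy layer whose settings the paper varies: differential privacy in FL is commonly realized by clipping each local update to a bounded norm and adding calibrated noise before it leaves the node. The sketch below uses a generic Gaussian mechanism; the clip norm and noise multiplier are illustrative assumptions, not this paper's configuration.

```python
import numpy as np

def dp_sanitize(update, clip_norm=1.0, noise_multiplier=1.1, rng=None):
    """Clip an update to a bounded L2 norm, then add Gaussian noise."""
    rng = rng or np.random.default_rng()
    norm = np.linalg.norm(update)
    clipped = update * min(1.0, clip_norm / (norm + 1e-12))
    noise = rng.normal(scale=noise_multiplier * clip_norm, size=update.shape)
    return clipped + noise

update = np.random.default_rng(3).normal(size=128)
private_update = dp_sanitize(update)   # what the node actually shares
```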
2023-01-05
Zhao, Jing, Wang, Ruwu.  2022.  FedMix: A Sybil Attack Detection System Considering Cross-layer Information Fusion and Privacy Protection. 2022 19th Annual IEEE International Conference on Sensing, Communication, and Networking (SECON). :199–207.
The Sybil attack is one of the most dangerous internal attacks in Vehicular Ad Hoc Networks (VANETs). It disrupts the functioning of the VANET by maliciously claiming or stealing multiple identities to propagate erroneous messages. Many solutions have been proposed to protect VANETs from Sybil attacks. However, the existing solutions operate on single-level data from either the physical or the application layer and lack research on cross-layer information fusion for detection. Moreover, these schemes involve a large amount of sensitive data access and transmission, do not consider users' privacy, and can impose a severe communication burden, which makes them impractical to deploy. In this context, this paper introduces FedMix, the first federated Sybil attack detection system that considers cross-layer information fusion and provides privacy protection. The system can integrate VANET physical layer data and application layer data for joint analysis. The data reside locally in the vehicle for local training. Then, the central agency aggregates only the generated models and finally distributes the result to the vehicles for attack detection. This process does not involve transmitting or accessing any vehicle's original data. Meanwhile, we also designed a new model aggregation algorithm called SFedAvg to solve the problems of unbalanced vehicle data quality and low aggregation efficiency. Experiments show that FedMix can provide an intelligent model with equivalent performance under the premise of privacy protection and significantly reduce communication overhead, compared with a traditional centralized training attack detection model. In addition, the SFedAvg algorithm and cross-layer information fusion bring better aggregation efficiency and detection performance, respectively.
2022-12-01
Kamhoua, Georges, Bandara, Eranga, Foytik, Peter, Aggarwal, Priyanka, Shetty, Sachin.  2021.  Resilient and Verifiable Federated Learning against Byzantine Colluding Attacks. 2021 Third IEEE International Conference on Trust, Privacy and Security in Intelligent Systems and Applications (TPS-ISA). :31–40.
Federated Learning (FL) is a multiparty learning approach that can aid privacy-preserving machine learning. However, FL faces several potential security and privacy threats. First, existing FL requires a central coordinator for the learning process, which brings a single point of failure and trust issues for the shared trained model. Second, during the learning process, intentionally unreliable model updates performed by Byzantine colluding parties can lower the quality and convergence of the shared ML models. Discovering verifiable local model updates (i.e., their integrity or correctness) and trusted parties in FL therefore becomes crucial. In this paper, we propose a resilient and verifiable FL algorithm based on a reputation scheme to cope with unreliable parties. We develop a selection algorithm for the task publisher and a blockchain-based multiparty learning architecture in which local model updates are securely exchanged and verified without the central party. We also propose a novel auditing scheme that makes our approach resilient to up to 50% of parties colluding in a Byzantine attack in a malicious scenario.
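As a toy illustration of how a reputation scheme can steer aggregation (the paper's blockchain-based exchange, task-publisher selection, and auditing are not modeled here), the sketch below scores each party by its distance to the coordinate-wise median update, decays reputations over rounds, and aggregates with reputation weights. The scoring rule and decay factor are assumptions for illustration.

```python
import numpy as np

def reputation_aggregate(updates, reputation, decay=0.9):
    """Weight party updates by reputation; penalize outliers vs. the median.

    updates:    (n, d) local model updates for this round
    reputation: (n,) scores, updated in place each round
    """
    med = np.median(updates, axis=0)
    dist = np.linalg.norm(updates - med, axis=1)
    score = np.exp(-dist / (dist.mean() + 1e-12))    # closer to median -> higher
    reputation[:] = decay * reputation + (1 - decay) * score
    w = reputation / reputation.sum()
    return (w[:, None] * updates).sum(axis=0)

rng = np.random.default_rng(9)
rep = np.ones(5)
updates = rng.normal(size=(5, 16))
updates[0] += 5.0                        # one unreliable (colluding) party
agg = reputation_aggregate(updates, rep)
print(rep.round(3))                      # party 0's reputation drops
```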
2022-11-18
Tian, Pu, Hatcher, William Grant, Liao, Weixian, Yu, Wei, Blasch, Erik.  2021.  FALIoTSE: Towards Federated Adversarial Learning for IoT Search Engine Resiliency. 2021 IEEE Intl Conf on Dependable, Autonomic and Secure Computing, Intl Conf on Pervasive Intelligence and Computing, Intl Conf on Cloud and Big Data Computing, Intl Conf on Cyber Science and Technology Congress (DASC/PiCom/CBDCom/CyberSciTech). :290–297.
To improve efficiency and resource usage in data retrieval, an Internet of Things (IoT) search engine organizes a vast amount of scattered data and responds to client queries with processed results. Machine learning provides a deep understanding of complex patterns and enables enhanced feedback to users through well-trained models. Nonetheless, machine learning models are prone to adversarial attacks via the injection of elaborate perturbations, resulting in subverted outputs. Adversarial attacks on time-series data in particular demand urgent attention, as sensors in IoT systems are collecting an increasing volume of sequential data. This paper investigates adversarial attacks on time-series analysis in an IoT search engine (IoTSE) system. Specifically, we consider the Long Short-Term Memory (LSTM) Recurrent Neural Network (RNN) as our base model, implemented in a simulated federated learning scheme. We propose Federated Adversarial Learning for IoT Search Engine (FALIoTSE), which exploits the shared parameters of the federated model as the target for adversarial example generation and resiliency. Using a real-world smart parking garage dataset, the impact of an attack on FALIoTSE is demonstrated under various levels of perturbation. The experiments show that the training error increases significantly with noise injected into the gradients.
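The attack surface here is that shared federated parameters let an adversary compute gradients for adversarial example generation. Below is a minimal Fast Gradient Sign Method (FGSM) sketch on a toy differentiable model; logistic regression stands in for the paper's LSTM (which would require a deep learning framework), and the epsilon value is an illustrative assumption.

```python
import numpy as np

def fgsm_perturb(x, grad_x, eps=0.1):
    """FGSM: step the input in the sign direction that increases the loss."""
    return x + eps * np.sign(grad_x)

# Toy target: logistic regression with loss -log sigmoid(y * w.x).
rng = np.random.default_rng(4)
w = rng.normal(size=16)                # "stolen" shared model parameters
x, y = rng.normal(size=16), 1.0
p = 1.0 / (1.0 + np.exp(-y * (w @ x)))
grad_x = -(1.0 - p) * y * w            # analytic d(loss)/d(input)
x_adv = fgsm_perturb(x, grad_x)        # adversarial input vector
```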
2022-11-08
HeydariGorji, Ali, Rezaei, Siavash, Torabzadehkashi, Mahdi, Bobarshad, Hossein, Alves, Vladimir, Chou, Pai H..  2020.  HyperTune: Dynamic Hyperparameter Tuning for Efficient Distribution of DNN Training Over Heterogeneous Systems. 2020 IEEE/ACM International Conference On Computer Aided Design (ICCAD). :1–8.
Distributed training is a novel approach to accelerating the training of Deep Neural Networks (DNNs), but common training libraries fall short of addressing the distributed nature of heterogeneous processors or interruption by other workloads on shared processing nodes. This paper describes distributed training of DNNs on computational storage devices (CSDs), which are NAND flash-based, high-capacity data storage devices with internal processing engines. A CSD-based distributed architecture incorporates the advantages of federated learning in terms of performance scalability, resiliency, and data privacy by eliminating unnecessary data movement between the storage device and the host processor. The paper also describes Stannis, a DNN training framework that improves on the shortcomings of existing distributed training frameworks by dynamically tuning the training hyperparameters in heterogeneous systems to maintain the maximum overall processing speed in terms of processed images per second and energy efficiency. Experimental results on image classification training benchmarks show up to 3.1x improvement in performance and 2.45x reduction in energy consumption when using Stannis plus CSD compared to generic systems.
2022-10-06
Zhang, Jiachao, Yu, Peiran, Qi, Le, Liu, Song, Zhang, Haiyu, Zhang, Jianzhong.  2021.  FLDDoS: DDoS Attack Detection Model based on Federated Learning. 2021 IEEE 20th International Conference on Trust, Security and Privacy in Computing and Communications (TrustCom). :635–642.
Recently, DDoS attacks have developed rapidly and become one of the most important threats to the Internet. Traditional machine learning and deep learning methods cannot train a satisfactory model based on the data of a single client. Moreover, in real scenarios there are a large number of devices used for traffic collection, and the owners of these devices are often unwilling to share data with each other because of the research and analysis value of the attack traffic, which limits the accuracy of the model. To solve these problems, we design a DDoS attack detection model based on federated learning, named FLDDoS, so that the local model can learn the data of each client without the data being shared. In addition, considering that the distribution of attack detection datasets is extremely imbalanced and the proportion of attack samples is very small, we propose a hierarchical aggregation algorithm based on K-Means and a data resampling method based on SMOTEENN. The results show that our model improves accuracy by 4% compared with the traditional method and reduces the number of communication rounds by 40%.
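The SMOTE+ENN combination the resampling method builds on is available off the shelf in the imbalanced-learn library; a minimal sketch on simulated flow features follows, as it would typically be applied to a client's local data. The feature dimensions and class counts are illustrative assumptions.

```python
import numpy as np
from imblearn.combine import SMOTEENN

rng = np.random.default_rng(5)
# Simulated flow features: 1000 benign samples vs. 30 attack samples.
X = np.vstack([rng.normal(0, 1, (1000, 6)), rng.normal(2, 1, (30, 6))])
y = np.array([0] * 1000 + [1] * 30)

# SMOTE oversamples the minority (attack) class; ENN then removes noisy samples.
X_res, y_res = SMOTEENN(random_state=0).fit_resample(X, y)
print(np.bincount(y), "->", np.bincount(y_res))
```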
2022-08-26
Liu, Tianyu, Di, Boya, Wang, Shupeng, Song, Lingyang.  2021.  A Privacy-Preserving Incentive Mechanism for Federated Cloud-Edge Learning. 2021 IEEE Global Communications Conference (GLOBECOM). :1–6.
The federated learning scheme enhances privacy preservation by avoiding private data uploading in cloud-edge computing. However, attacks against the uploaded model updates can still cause private data leakage, which demotivates the privacy-sensitive participating edge devices. To address this issue, we aim to design a privacy-preserving incentive mechanism for the federated cloud-edge learning (PFCEL) system such that 1) the edge devices are motivated to actively contribute to the uploading of updated models, and 2) a trade-off between private data leakage and model accuracy is achieved. We formulate the incentive design problem as a three-layer Stackelberg game, where the server-device interaction is further formulated as a contract design problem. Extensive numerical evaluations demonstrate the effectiveness of our designed mechanism in terms of privacy preservation and system utility.
2022-04-26
Yang, Ge, Wang, Shaowei, Wang, Haijie.  2021.  Federated Learning with Personalized Local Differential Privacy. 2021 IEEE 6th International Conference on Computer and Communication Systems (ICCCS). :484–489.

Recently, federated learning (FL), as an advanced and practical solution, has been applied to deal with privacy-preserving issues in distributed multi-party federated modeling. However, most existing FL methods focus on a uniform privacy-preserving budget and ignore the varying privacy requirements of participants. In this paper, we propose, for the first time, an algorithm (PLU-FedOA) to optimize deep neural network training in horizontal FL under personalized local differential privacy. To this end, we design two approaches: PLU, which allows clients to upload local updates under differential privacy at a personally selected privacy level, and FedOA, which helps the server aggregate local parameters with optimized weights in mixed privacy-preserving scenarios. Moreover, we theoretically analyze the privacy and optimization effects of our approaches. Finally, we verify PLU-FedOA on real-world datasets.
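The per-client side of personalized local differential privacy can be pictured as each client perturbing its own update with noise calibrated to its chosen budget: smaller ε means stronger privacy and more noise. A minimal Laplace-mechanism sketch follows; the clipping bound and the per-coordinate accounting are simplifying assumptions, not PLU's exact construction.

```python
import numpy as np

def personalized_ldp_update(update, epsilon, clip=1.0, rng=None):
    """Perturb a local update with Laplace noise at a client-chosen level.

    Per-coordinate values are clipped to [-clip, clip] (sensitivity 2*clip);
    a full-vector privacy account would compose the budget over coordinates.
    """
    rng = rng or np.random.default_rng()
    clipped = np.clip(update, -clip, clip)
    scale = 2.0 * clip / epsilon          # Laplace scale for this sensitivity
    return clipped + rng.laplace(scale=scale, size=update.shape)

rng = np.random.default_rng(6)
update = rng.normal(size=32)
for eps in (0.5, 2.0, 8.0):               # each client picks its own budget
    noisy = personalized_ldp_update(update, eps, rng=rng)
    print(eps, np.abs(noisy - np.clip(update, -1, 1)).mean())  # noise shrinks
```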

Tekgul, Buse G. A., Xia, Yuxi, Marchal, Samuel, Asokan, N..  2021.  WAFFLE: Watermarking in Federated Learning. 2021 40th International Symposium on Reliable Distributed Systems (SRDS). :310–320.

Federated learning is a distributed learning technique where machine learning models are trained on client devices on which the local training data resides. The training is coordinated via a central server which is, typically, controlled by the intended owner of the resulting model. By avoiding the need to transport the training data to the central server, federated learning improves privacy and efficiency. But it raises the risk of model theft by clients because the resulting model is available on every client device. Even if the application software used for local training may attempt to prevent direct access to the model, a malicious client may bypass any such restrictions by reverse engineering the application software. Watermarking is a well-known deterrence method against model theft, providing the means for model owners to demonstrate ownership of their models. Several recent deep neural network (DNN) watermarking techniques use backdooring: training the models with additional mislabeled data. Backdooring requires full access to the training data and control of the training process. This is feasible when a single party trains the model in a centralized manner, but not in a federated learning setting where the training process and training data are distributed among several client devices. In this paper, we present WAFFLE, the first approach to watermark DNN models trained using federated learning. It introduces a retraining step at the server after each aggregation of local models into the global model. We show that WAFFLE efficiently embeds a resilient watermark into models, incurring only negligible degradation in test accuracy (-0.17%), and does not require access to training data. We also introduce a novel technique to generate the backdoor used as a watermark; it outperforms prior techniques, imposing no communication overhead and only a low computational overhead (+3.2%). The research report version of this paper is available at https://arxiv.org/abs/2008.07298, and the code for reproducing our work can be found at https://github.com/ssg-research/WAFFLE.

2022-04-18
Zhang, Junpeng, Li, Mengqian, Zeng, Shuiguang, Xie, Bin, Zhao, Dongmei.  2021.  A Survey on Security and Privacy Threats to Federated Learning. 2021 International Conference on Networking and Network Applications (NaNA). :319–326.
Federated learning (FL) offers a promising scheme to break data silos, enabling multiple clients to construct a joint model without centralizing their data. A critical concern for flourishing FL applications is how to build a secure and privacy-preserving learning environment. It is thus highly necessary to comprehensively identify and classify potential threats so that FL can be used under security guarantees. This paper starts from the perspective of attacks launched by different computing participants to construct a unique threat classification, highlighting significant attacks, e.g., poisoning attacks, inference attacks, and generative adversarial network (GAN) attacks. Our study shows that existing FL protocols do not always provide sufficient security, facing various attacks from both clients and servers. GAN attacks pose larger threats than the other kinds, given the invisibility of the attack process. Moreover, we present a detailed review of several defense mechanisms and approaches to resist privacy risks and security breaches, and we summarize their respective advantages and weaknesses. Finally, we conclude the paper by discussing challenges and some potential research directions.
2022-04-12
Guo, Yifan, Wang, Qianlong, Ji, Tianxi, Wang, Xufei, Li, Pan.  2021.  Resisting Distributed Backdoor Attacks in Federated Learning: A Dynamic Norm Clipping Approach. 2021 IEEE International Conference on Big Data (Big Data). :1172–1182.
With the advance of artificial intelligence and high-dimensional data analysis, federated learning (FL) has emerged to allow distributed data providers to collaboratively learn without direct access to local sensitive data. However, limiting access to individual providers' data inevitably incurs security issues. For instance, backdoor attacks, one of the most popular data poisoning attacks in FL, severely threaten the integrity and utility of the FL system. In particular, backdoor attacks launched by multiple collusive attackers, i.e., distributed backdoor attacks, can achieve high attack success rates and are hard to detect. Existing defensive approaches, like model inspection or model sanitization, often require access to a portion of the local training data, which renders them inapplicable to FL scenarios. Recently, the norm clipping approach was developed to effectively defend against distributed backdoor attacks in FL without relying on local training data. However, we discover that adversaries can still bypass this defense scheme through robust training, because its norm clipping threshold never changes. In this paper, we propose a novel defense scheme to resist distributed backdoor attacks in FL. Particularly, we first identify that the main reason for the failure of the norm clipping scheme is its fixed threshold in the training process, which cannot capture the dynamic nature of benign local updates during the global model's convergence. Motivated by this, we devise a novel defense mechanism to dynamically adjust the norm clipping threshold of local updates. Moreover, we provide a convergence analysis of our defense scheme. Evaluating it on four non-IID public datasets, we observe that our defense scheme can effectively resist distributed backdoor attacks and ensure the global model's convergence. Notably, our scheme reduces the attack success rates by 84.23% on average compared with existing defense schemes.
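The fix the paper argues for, re-deriving the clipping threshold from the current round's updates instead of fixing it once, can be sketched in a few lines; the median-of-norms rule below is an illustrative choice, not the authors' exact adjustment mechanism.

```python
import numpy as np

def dynamic_clip_aggregate(updates, quantile=0.5):
    """Clip local updates to a threshold that tracks this round's norms."""
    norms = np.linalg.norm(updates, axis=1)
    tau = np.quantile(norms, quantile)            # dynamic threshold
    scale = np.minimum(1.0, tau / (norms + 1e-12))
    return (updates * scale[:, None]).mean(axis=0)

rng = np.random.default_rng(7)
benign = rng.normal(scale=0.1, size=(8, 64))
backdoor = rng.normal(scale=2.0, size=(2, 64))    # inflated malicious updates
agg = dynamic_clip_aggregate(np.vstack([benign, backdoor]))
```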
2022-03-23
Jiang, Yupeng, Li, Yong, Zhou, Yipeng, Zheng, Xi.  2021.  Sybil Attacks and Defense on Differential Privacy based Federated Learning. 2021 IEEE 20th International Conference on Trust, Security and Privacy in Computing and Communications (TrustCom). :355–362.
In federated learning, machine learning and deep learning models are trained globally on distributed devices. The state-of-the-art privacy-preserving technique in the context of federated learning is user-level differential privacy. However, such a mechanism is vulnerable to some specific model poisoning attacks such as Sybil attacks. A malicious adversary can create multiple fake clients or collude compromised devices in a Sybil attack to mount direct manipulation of model updates. Recent work on novel defenses against model poisoning attacks has difficulty detecting Sybil attacks when differential privacy is utilized, as it masks clients' model updates with perturbation. In this work, we implement the first Sybil attacks on differential privacy based federated learning architectures and show their impact on model convergence. We randomly compromise some clients by manipulating the noise levels, reflected by the local privacy budget ε of differential privacy with the Laplace mechanism, on the local model updates of these Sybil clients. As a result, the global model's convergence rate decreases, and training can even diverge. We apply our attacks to two recent aggregation defense mechanisms, called Krum and Trimmed Mean. Our evaluation results on the MNIST and CIFAR-10 datasets show that our attacks effectively slow down the convergence of the global models. We then propose a defense that monitors the average loss of all participants in each round for convergence anomaly detection, based on the training loss reported by randomly selected sets of clients acting as judging panels. Our empirical study demonstrates that our defense effectively mitigates the impact of our Sybil attacks.
2022-03-08
Wang, Xinyi, Yang, Bo, Liu, Qi, Jin, Tiankai, Chen, Cailian.  2021.  Collaboratively Diagnosing IGBT Open-circuit Faults in Photovoltaic Inverters: A Decentralized Federated Learning-based Method. IECON 2021 – 47th Annual Conference of the IEEE Industrial Electronics Society. :1–6.
In photovoltaic (PV) systems, machine learning-based methods have been used for fault detection and diagnosis in the past years, which require large amounts of data. However, fault types in a single PV station are usually insufficient in practice. Due to insufficient and non-identically distributed data, packet loss and privacy concerns, it is difficult to train a model for diagnosing all fault types. To address these issues, in this paper, we propose a decentralized federated learning (FL)-based fault diagnosis method for insulated gate bipolar transistor (IGBT) open-circuits in PV inverters. All PV stations use the convolutional neural network (CNN) to train local diagnosis models. By aggregating neighboring model parameters, each PV station benefits from the fault diagnosis knowledge learned from neighbors and achieves diagnosing all fault types without sharing original data. Extensive experiments are conducted in terms of non-identical data distributions, various transmission channel conditions and whether to use the FL framework. The results are as follows: 1) Using data with non-identical distributions, the collaboratively trained model diagnoses faults accurately and robustly; 2) The continuous transmission and aggregation of model parameters in multiple rounds make it possible to obtain ideal training results even in the presence of packet loss; 3) The proposed method allows each PV station to diagnose all fault types without original data sharing, which protects data privacy.
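The neighbor-aggregation step can be pictured as repeated local averaging over the station graph, which drives all stations toward a consensus model without any central server. A minimal sketch follows; the ring topology, uniform mixing weights, and parameter count are illustrative assumptions (the paper aggregates CNN parameters).

```python
import numpy as np

def gossip_round(models, adjacency):
    """One decentralized round: each node averages with its neighbors.

    models:    (n, d) current parameters of each PV station
    adjacency: (n, n) 0/1 matrix, 1 where stations exchange parameters
    """
    mix = adjacency + np.eye(len(models))        # include each node's own model
    mix = mix / mix.sum(axis=1, keepdims=True)   # row-stochastic mixing weights
    return mix @ models

rng = np.random.default_rng(8)
models = rng.normal(size=(4, 10))                # 4 stations, 10 parameters
ring = np.array([[0, 1, 0, 1], [1, 0, 1, 0], [0, 1, 0, 1], [1, 0, 1, 0]])
for _ in range(20):
    models = gossip_round(models, ring)
print(models.std(axis=0).max())                  # near 0: consensus reached
```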
2022-02-24
Alabbasi, Abdulrahman, Ganjalizadeh, Milad, Vandikas, Konstantinos, Petrova, Marina.  2021.  On Cascaded Federated Learning for Multi-Tier Predictive Models. 2021 IEEE International Conference on Communications Workshops (ICC Workshops). :1–7.
The performance prediction of user equipment (UE) metrics has many applications in the 5G era and beyond. For instance, throughput prediction can improve carrier selection, adaptive video streaming's quality of experience (QoE), and traffic latency. Many studies suggest distributed learning algorithms (e.g., federated learning (FL)) for this purpose. However, in a multi-tier design, features are measured in different tiers, e.g., UE tier, and gNodeB (gNB) tier. On one hand, neglecting the measurements in one tier results in inaccurate predictions. On the other hand, transmitting the data from one tier to another improves the prediction performance at the expense of increasing network overhead and privacy risks. In this paper, we propose cascaded FL to enhance UE throughput prediction with minimum network footprint and privacy ramifications (if any). The idea is to introduce feedback to conventional FL, in multi-tier architectures. Although we use cascaded FL for UE prediction tasks, the idea is rather general and can be used for many prediction problems in multi-tier architectures, such as cellular networks. We evaluate the performance of cascaded FL by detailed and 3GPP compliant simulations of London's city center. Our simulations show that the proposed cascaded FL can achieve up to 54% improvement over conventional FL in the normalized gain, at the cost of 1.8 MB (without quantization) and no cost with quantization.
2022-02-22
Wink, Tobias, Nochta, Zoltan.  2021.  An Approach for Peer-to-Peer Federated Learning. 2021 51st Annual IEEE/IFIP International Conference on Dependable Systems and Networks Workshops (DSN-W). :150–157.
We present a novel approach for the collaborative training of neural network models in decentralized federated environments. In an iterative process, a group of autonomous peers runs multiple training rounds to train a common model. Thereby, participants perform all model training steps locally, such as stochastic gradient descent optimization, using their private, e.g. mission-critical, training datasets. Based on the locally updated models, participants can jointly determine a common model by averaging all associated model weights without sharing the actual weight values. For this purpose we introduce a simple n-out-of-n secret sharing scheme and an algorithm to calculate average values in a peer-to-peer manner. Our experimental results with deep neural networks on well-known sample datasets prove the generic applicability of the approach with regard to model quality parameters. Since there is no need to involve a central service provider in model training, the approach can help establish trustworthy collaboration platforms for businesses with high security and data protection requirements.
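The n-out-of-n additive scheme mentioned above fits in a few lines: each peer splits its (fixed-point encoded) weight into n random shares that sum to the value modulo a public modulus, distributes one share to each peer, and only sums of shares are ever revealed. The modulus and encoding below are illustrative assumptions; the paper's protocol details may differ.

```python
import random

MOD = 2**31 - 1  # public modulus; encoded values must stay well below it

def share(value, n):
    """Split an integer into n additive shares that sum to value (mod MOD)."""
    shares = [random.randrange(MOD) for _ in range(n - 1)]
    shares.append((value - sum(shares)) % MOD)
    return shares

def secure_average(all_shares, n_peers):
    """Each peer sums the shares it holds; the share-sums reveal only the total."""
    partials = [sum(col) % MOD for col in zip(*all_shares)]
    return (sum(partials) % MOD) / n_peers

# Three peers average one weight each without revealing individual values.
weights = [12340, 56780, 90120]                 # e.g. weight * 10**4, rounded
all_shares = [share(w, n=3) for w in weights]   # peer i sends one share per peer
print(secure_average(all_shares, n_peers=3))    # 53080.0 -> decode: 5.308
```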
2022-01-31
Liu, Yong, Zhu, Xinghua, Wang, Jianzong, Xiao, Jing.  2021.  A Quantitative Metric for Privacy Leakage in Federated Learning. ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). :3065–3069.
In a federated learning system, parameter gradients are shared among participants and the central modulator, while the original data never leave their protected source domain. However, the gradient itself might carry enough information for precise inference of the original data. By reporting their parameter gradients to the central server, client datasets are exposed to inference attacks from adversaries. In this paper, we propose a quantitative metric based on mutual information for clients to evaluate the potential risk of information leakage in their gradients. Mutual information has received increasing attention in the machine learning and data mining community over the past few years. However, existing mutual information estimation methods cannot handle high-dimensional variables. We therefore propose a novel method to approximate the mutual information between high-dimensional gradients and batched input data. Experimental results show that the proposed metric reliably reflects the extent of information leakage in federated learning. In addition, using the proposed metric, we investigate the influential factors of risk level. It is shown that the risk of information leakage is related to the status of the task model as well as to the inherent data distribution.
2022-01-25
Marulli, Fiammetta, Balzanella, Antonio, Campanile, Lelio, Iacono, Mauro, Mastroianni, Michele.  2021.  Exploring a Federated Learning Approach to Enhance Authorship Attribution of Misleading Information from Heterogeneous Sources. 2021 International Joint Conference on Neural Networks (IJCNN). :1–8.
Authorship Attribution (AA) is currently applied in several applications, among which are fraud detection and anti-plagiarism checks; this task can leverage stylometry and Natural Language Processing techniques. In this work, we explored strategies to enhance the performance of an AA task for the automatic detection of false and misleading information (e.g., fake news). We set up a text classification model for AA based on stylometry, exploiting recurrent deep neural networks, and implemented two learning tasks trained on the same collection of fake and real news, comparing their performance: one based on a Federated Learning architecture, the other on a centralized architecture. The goal was to discriminate potentially fake information from true information when the fake news comes from heterogeneous sources with different styles. Preliminary experiments show that the distributed approach significantly improves recall with respect to the centralized model. As expected, precision was lower in the distributed model. This aspect, coupled with the statistical heterogeneity of data, represents an open issue that will be further investigated in future work.