Biblio

Filters: Keyword is Learning systems
2021-12-20
Masuda, Hiroki, Kita, Kentaro, Koizumi, Yuki, Takemasa, Junji, Hasegawa, Toru.  2021.  Model Fragmentation, Shuffle and Aggregation to Mitigate Model Inversion in Federated Learning. 2021 IEEE International Symposium on Local and Metropolitan Area Networks (LANMAN). :1–6.
Federated learning is a privacy-preserving learning system where participants locally update a shared model with their own training data. Despite the advantage that training data are not sent to a server, there is still a risk that a state-of-the-art model inversion attack, which may be conducted by the server, infers training data from the models updated by the participants, referred to as individual models. A solution to prevent such attacks is differential privacy, where each participant adds noise to the individual model before sending it to the server. Differential privacy, however, sacrifices the quality of the shared model in compensation for the fact that participants' training data are not leaked. This paper proposes a federated learning system that is resistant to model inversion attacks without sacrificing the quality of the shared model. The core idea is that each participant divides the individual model into model fragments, shuffles, and aggregates them to prevent adversaries from inferring training data. The other benefit of the proposed system is that the resulting shared model is identical to the shared model generated with the naive federated learning.
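The fragment, shuffle, and aggregate idea can be illustrated with a toy example. The sketch below is not the authors' protocol, which the abstract does not detail. It assumes that models are flat lists of floats, that a fragment is a contiguous slice tagged only with its start index, and that a single function plays every role; all function names are hypothetical. It shows only why the aggregate can match naive averaging: element-wise summation does not depend on which participant contributed which fragment.

```python
import random

def naive_aggregate(models):
    """Element-wise mean of the participants' full model vectors."""
    n = len(models)
    return [sum(column) / n for column in zip(*models)]

def fragment(model, num_fragments):
    """Split one model vector into (start_index, values) fragments."""
    size = -(-len(model) // num_fragments)  # ceiling division
    return [(start, model[start:start + size]) for start in range(0, len(model), size)]

def shuffled_aggregate(models, num_fragments, seed=0):
    """Pool every participant's fragments, shuffle them, then average by position."""
    pool = [frag for model in models for frag in fragment(model, num_fragments)]
    # After shuffling, the order of the pool carries no hint of who sent which fragment.
    random.Random(seed).shuffle(pool)
    totals = [0.0] * len(models[0])
    for start, values in pool:
        for offset, value in enumerate(values):
            totals[start + offset] += value
    return [total / len(models) for total in totals]

# Hypothetical individual models from three participants.
models = [[1.0, 2.0, 3.0, 4.0], [0.5, 0.5, 0.5, 0.5], [2.0, 0.0, -1.0, 3.0]]
print(naive_aggregate(models))
print(shuffled_aggregate(models, num_fragments=2))
```

Both calls should print the same vector up to floating-point rounding, since only the order of the additions changes.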
2021-09-30
Wang, Wei, Liu, Tieyuan, Chang, Liang, Gu, Tianlong, Zhao, Xuemei.  2020.  Convolutional Recurrent Neural Networks for Knowledge Tracing. 2020 International Conference on Cyber-Enabled Distributed Computing and Knowledge Discovery (CyberC). :287–290.
Knowledge Tracing (KT) is a task that aims to assess students' mastery level of knowledge and predict their performance on questions, and it has attracted widespread attention over the years. Recently, an increasing number of studies have applied deep learning techniques to knowledge tracing and have achieved great success over traditional Bayesian Knowledge Tracing methods. Most existing deep learning-based methods use either Recurrent Neural Networks (RNNs) or Convolutional Neural Networks (CNNs). However, it is worth noting that these two kinds of models are complementary in their modeling abilities. Thus, in this paper, we propose a novel knowledge tracing model that takes advantage of both by combining them into a single integrated model, named Convolutional Recurrent Knowledge Tracing (CRKT). Extensive experiments show that our model outperforms the state-of-the-art models on multiple KT datasets.
2021-07-07
G H, Samyama Gunjal, Swamy, Samarth C.  2020.  A Security Approach to Build a Trustworthy Ubiquitous Learning System. 2020 IEEE Bangalore Humanitarian Technology Conference (B-HTC). :1–6.
Modern learning systems, such as a tutoring platform, have many characteristics, like digital data presentation with interactivity and mobility, which provide information about the study content according to the learner's level of understanding, intelligent learner behavior, etc. A sophisticated ubiquitous learner system maintains security, monitors the mischievous behavior of the learner, and authenticates and authorizes every learner, which is essential. Some of the existing security schemes aim only at single entry-point authentication, which may not suit a ubiquitous tutor platform. We propose a secure authentication scheme based on the information utility of the learner. Whenever a learner moves into a tutor platform that has ubiquitous learner system technology, the system first begins with the learner's identity authentication, and then initiates trust evaluation after the learner has been successfully authenticated. Periodic credential verification of the learner is carried out, which strengthens the proposed authentication scheme. BAN logic has been used to prove the authentication in this system. The proposed authentication scheme has been simulated and analyzed for the indoor tutor platform environment.
2021-05-18
Feng, Qi, Feng, Chendong, Hong, Weijiang.  2020.  Graph Neural Network-based Vulnerability Predication. 2020 IEEE International Conference on Software Maintenance and Evolution (ICSME). :800–801.
Automatic vulnerability detection is challenging. In this paper, we report our in-progress work of vulnerability prediction based on graph neural network (GNN). We propose a general GNN-based framework for predicting the vulnerabilities in program functions. We study the different instantiations of the framework in representative program graph representations, initial node encodings, and GNN learning methods. The preliminary experimental results on a representative benchmark indicate that the GNN-based method can improve the accuracy and recall rates of vulnerability prediction.
2021-01-11
Jiang, P., Liao, S.  2020.  Differential Privacy Online Learning Based on the Composition Theorem. 2020 IEEE 10th International Conference on Electronics Information and Emergency Communication (ICEIEC). :200–203.
Privacy protection is becoming more and more important in the era of big data. Differential privacy is a rigorous and provable privacy protection method that can protect privacy for a single piece of data. But existing differential privacy online learning methods have great limitations in the scope of application and accuracy. Aiming at this problem, we propose a more general and accurate algorithm, named DPOL-CT, for differential privacy online learning. We first distinguish the difference in differential privacy protection between offline learning and online learning. Then we prove that the DPOL-CT algorithm achieves (ε, δ)-differential privacy for online learning under the Gaussian, the Laplace and the Staircase mechanisms and enjoys a sublinear expected regret bound. We further discuss the trade-off between the differential privacy level and the regret bound. Theoretical analysis and experimental results show that the DPOL-CT algorithm has good performance guarantees.
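As a point of reference for the noise mechanisms named in this abstract, the sketch below shows the textbook Laplace mechanism, which adds noise with scale sensitivity / ε to a numeric value and gives pure ε-differential privacy (δ = 0). It is not the DPOL-CT algorithm, whose details the abstract does not give; the function names and the example query value are hypothetical.

```python
import random

def laplace_scale(sensitivity, epsilon):
    """Noise scale b = sensitivity / epsilon for the Laplace mechanism."""
    if sensitivity <= 0 or epsilon <= 0:
        raise ValueError("sensitivity and epsilon must be positive")
    return sensitivity / epsilon

def laplace_noise(scale, rng):
    """Draw one Laplace(0, scale) sample as the difference of two Exp(1) draws."""
    return scale * (rng.expovariate(1.0) - rng.expovariate(1.0))

def privatize(value, sensitivity, epsilon, rng):
    """Return value plus Laplace noise calibrated to sensitivity / epsilon."""
    return value + laplace_noise(laplace_scale(sensitivity, epsilon), rng)

rng = random.Random(0)
true_count = 120  # hypothetical query result with sensitivity 1
for epsilon in (0.1, 1.0, 10.0):
    # Smaller epsilon means stronger privacy and, on average, larger noise.
    print(epsilon, privatize(true_count, sensitivity=1.0, epsilon=epsilon, rng=rng))
```

The loop illustrates the trade-off the abstract mentions in general terms: a stricter privacy level is paid for with noisier outputs.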
2020-12-11
Zhou, Y., Zeng, Z.  2019.  Info-Retrieval with Relevance Feedback using Hybrid Learning Scheme for RS Image. 2019 International Conference on Cyber-Enabled Distributed Computing and Knowledge Discovery (CyberC). :135–138.

Relevance feedback can be considered a learning problem. It has been used extensively to improve the performance of multimedia information retrieval. In this paper, after relevance feedback in content-based image retrieval (CBIR) is discussed, a hybrid learning scheme for multi-target retrieval (MTR) with relevance feedback is proposed. Suppose that an object-level symbolic image database (SID) with combined image metadata and a feature model has been constructed. During an interactive query for remote sensing images, we calculate the similarity metric to get the relevant image sets from the image library. To further improve the precision of image retrieval, a hybrid learning scheme parameter also needs to be chosen. Our hybrid learning scheme consists of an expectation maximization algorithm (EMA), used for retrieving the most relevant images from the SID, and a support vector machine (SVM) with relevance feedback, used for learning the feedback information. Experimental results show that our hybrid learning scheme with relevance feedback on MTR can improve performance and accuracy compared with the basic algorithms.

2020-07-20
Pengcheng, Li, Yi, Jinfeng, Zhang, Lijun.  2018.  Query-Efficient Black-Box Attack by Active Learning. 2018 IEEE International Conference on Data Mining (ICDM). :1200–1205.
Deep neural network (DNN), a popular machine learning model, has been found to be vulnerable to adversarial attacks. Such an attack constructs adversarial examples by adding small perturbations to the raw input; the examples appear unmodified to human eyes but are misclassified by a well-trained classifier. In this paper, we focus on the black-box attack setting where attackers have almost no access to the underlying models. To conduct a black-box attack, a popular approach trains a substitute model based on the information queried from the target DNN. The substitute model can then be attacked using existing white-box attack approaches, and the generated adversarial examples will be used to attack the target DNN. Despite its encouraging results, this approach suffers from poor query efficiency, i.e., attackers usually need to query a huge number of times to collect enough information for training an accurate substitute model. To this end, we first utilize state-of-the-art white-box attack methods to generate samples for querying, and then introduce an active learning strategy to significantly reduce the number of queries needed. Besides, we also propose a diversity criterion to avoid sampling bias. Our extensive experimental results on MNIST and CIFAR-10 show that the proposed method can reduce the number of queries by more than 90% while preserving attack success rates and obtaining an accurate substitute model that is more than 85% similar to the target oracle.
2020-01-20
Ou, Chung-Ming.  2019.  Host-based Intrusion Detection Systems Inspired by Machine Learning of Agent-Based Artificial Immune Systems. 2019 IEEE International Symposium on INnovations in Intelligent SysTems and Applications (INISTA). :1–5.

An adaptable agent-based IDS (AAIDS) inspired by the danger theory of artificial immune systems is proposed. The learning mechanism of AAIDS is designed by emulating how dendritic cells (DC) in immune systems detect and classify danger signals. The AG agent, DC agent and TC agent coordinate and respond to system calls directly rather than analyzing network packets. Simulations show that AAIDS can determine several critical scenarios of the system behaviors where packet analysis is impractical.

2019-06-10
Kalash, M., Rochan, M., Mohammed, N., Bruce, N. D. B., Wang, Y., Iqbal, F.  2018.  Malware Classification with Deep Convolutional Neural Networks. 2018 9th IFIP International Conference on New Technologies, Mobility and Security (NTMS). :1–5.

In this paper, we propose a deep learning framework for malware classification. There has been a huge increase in the volume of malware in recent years, which poses a serious security threat to financial institutions, businesses and individuals. In order to combat the proliferation of malware, new strategies are essential to quickly identify and classify malware samples so that their behavior can be analyzed. Machine learning approaches are becoming popular for classifying malware; however, most of the existing machine learning methods for malware classification use shallow learning algorithms (e.g. SVM). Recently, Convolutional Neural Networks (CNN), a deep learning approach, have shown superior performance compared to traditional learning algorithms, especially in tasks such as image classification. Motivated by this success, we propose a CNN-based architecture to classify malware samples. We convert malware binaries to grayscale images and subsequently train a CNN for classification. Experiments on two challenging malware classification datasets, Malimg and Microsoft malware, demonstrate that our method achieves better than state-of-the-art performance. The proposed method achieves 98.52% and 99.97% accuracy on the Malimg and Microsoft datasets respectively.
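The binary-to-image step can be sketched as follows: each byte of the file becomes one pixel with a gray level from 0 to 255, and the bytes are laid out row by row. The abstract does not state the image width or how a partial last row is handled, so the fixed `width` parameter and the zero padding below are assumptions, and the function name is hypothetical.

```python
def bytes_to_grayscale(data, width):
    """Arrange raw bytes into rows of `width` pixels; each byte is one 0-255 gray level."""
    if width <= 0:
        raise ValueError("width must be positive")
    padding = (-len(data)) % width  # zero bytes needed to fill the last row (assumed policy)
    padded = bytes(data) + bytes(padding)
    return [list(padded[i:i + width]) for i in range(0, len(padded), width)]

# Five hypothetical bytes laid out as a 3 x 2 image.
image = bytes_to_grayscale(b"MZ\x90\x00\x03", width=2)
print(image)  # [[77, 90], [144, 0], [3, 0]]
```

In practice the resulting rows would be handed to an image or tensor library and resized to the input shape that the CNN expects.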

2018-04-11
Deliu, I., Leichter, C., Franke, K.  2017.  Extracting Cyber Threat Intelligence from Hacker Forums: Support Vector Machines versus Convolutional Neural Networks. 2017 IEEE International Conference on Big Data (Big Data). :3648–3656.

Hacker forums and other social platforms may contain vital information about cyber security threats. But using manual analysis to extract relevant threat information from these sources is a time-consuming and error-prone process that requires a significant allocation of resources. In this paper, we explore the potential of Machine Learning methods to rapidly sift through hacker forums for relevant threat intelligence. Utilizing text data from a real hacker forum, we compared the text classification performance of Convolutional Neural Network methods against more traditional Machine Learning approaches. We found that traditional machine learning methods, such as Support Vector Machines, can yield high levels of performance that are on par with Convolutional Neural Network algorithms.

2018-02-14
Nam, C., Walker, P., Lewis, M., Sycara, K.  2017.  Predicting trust in human control of swarms via inverse reinforcement learning. 2017 26th IEEE International Symposium on Robot and Human Interactive Communication (RO-MAN). :528–533.
In this paper, we study the model of human trust where an operator controls a robotic swarm remotely for a search mission. Existing trust models in human-in-the-loop systems are based on task performance of robots. However, we find that humans tend to make their decisions based on physical characteristics of the swarm rather than its performance since task performance of swarms is not clearly perceivable by humans. We formulate trust as a Markov decision process whose state space includes physical parameters of the swarm. We employ an inverse reinforcement learning algorithm to learn behaviors of the operator from a single demonstration. The learned behaviors are used to predict the trust level of the operator based on the features of the swarm.
Feng, C., Wu, S., Liu, N.  2017.  A user-centric machine learning framework for cyber security operations center. 2017 IEEE International Conference on Intelligence and Security Informatics (ISI). :173–175.

To assure the cyber security of an enterprise, a SIEM (Security Information and Event Management) system is typically in place to normalize security events from different preventive technologies and flag alerts. Analysts in the security operation center (SOC) investigate the alerts to decide whether they are truly malicious. However, the number of alerts is generally overwhelming, with the majority of them being false positives, and exceeds the SOC's capacity to handle all alerts. Because of this, potential malicious attacks and compromised hosts may be missed. Machine learning is a viable approach to reduce the false positive rate and improve the productivity of SOC analysts. In this paper, we develop a user-centric machine learning framework for the cyber security operation center in a real enterprise environment. We discuss the typical data sources in the SOC, their workflow, and how to leverage and process these data sets to build an effective machine learning system. The paper is targeted towards two groups of readers. The first group is data scientists or machine learning researchers who do not have cyber security domain knowledge but want to build machine learning systems for security operations centers. The second group is cyber security practitioners who have deep knowledge and expertise in cyber security but do not have machine learning experience and wish to build such a system themselves. Throughout the paper, we use the system we built in the Symantec SOC production environment as an example to demonstrate the complete steps from data collection, label creation, feature engineering, machine learning algorithm selection, and model performance evaluation to risk score generation.

2017-12-28
Mehetrey, P., Shahriari, B., Moh, M.  2016.  Collaborative Ensemble-Learning Based Intrusion Detection Systems for Clouds. 2016 International Conference on Collaboration Technologies and Systems (CTS). :404–411.

Cloud computation has become prominent, with a seemingly unlimited amount of storage and computation available to users. Yet security is a major issue that hampers the growth of the cloud. In this research we investigate a collaborative Intrusion Detection System (IDS) based on the ensemble learning method. It uses weak classifiers and allows the use of untapped cloud resources to detect various types of attacks on the cloud system. In the proposed system, tasks are distributed among available virtual machines (VMs), and individual results are then merged for the final adaptation of the learning model. Performance evaluation is carried out using decision trees and fuzzy classifiers on KDD99, one of the largest datasets for IDS. The dataset is segmented in order to mimic the behavior of real-time data traffic occurring in a real cloud environment. The experimental results show that the proposed approach reduces the execution time with improved accuracy, and is fault-tolerant when handling VM failures. The system is a proof-of-concept model for a scalable, cloud-based distributed system that is able to explore untapped resources, and may be used as a base model for a real-time hierarchical IDS.

2017-12-12
Feng, W., Yan, W., Wu, S., Liu, N.  2017.  Wavelet transform and unsupervised machine learning to detect insider threat on cloud file-sharing. 2017 IEEE International Conference on Intelligence and Security Informatics (ISI). :155–157.

As more and more enterprises deploy cloud file-sharing services, a new channel opens for potential insider threats to company data and IPs. In this paper, we introduce a two-stage machine learning system to detect anomalies. In the first stage, we project the access logs of cloud file-sharing services onto relationship graphs and use three complementary graph-based unsupervised learning methods, OddBall, PageRank and Local Outlier Factor (LOF), to generate outlier indicators. In the second stage, we ensemble the outlier indicators, introduce the discrete wavelet transform (DWT) method, and propose a procedure that uses wavelet coefficients with the Haar wavelet function to identify outliers for insider threat. The proposed system has been deployed in a real business environment, and its effectiveness has been demonstrated through selected case studies.
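To show how Haar wavelet coefficients can expose an abrupt change in a score series, the sketch below computes single-level Haar detail coefficients and flags the sample pairs whose coefficient magnitude exceeds a threshold. The abstract does not describe the paper's actual procedure, so the fixed threshold rule, the function names, and the example series are all assumptions made for illustration.

```python
import math

def haar_detail(series):
    """Single-level Haar DWT detail coefficients for an even-length series."""
    if len(series) % 2:
        raise ValueError("series length must be even")
    return [(series[i] - series[i + 1]) / math.sqrt(2) for i in range(0, len(series), 2)]

def flag_outlier_pairs(series, threshold):
    """Return indices of sample pairs whose |detail coefficient| exceeds threshold."""
    return [k for k, d in enumerate(haar_detail(series)) if abs(d) > threshold]

# Hypothetical ensembled outlier indicator with one spike at position 5.
scores = [1.0, 1.0, 1.0, 1.0, 1.0, 9.0, 1.0, 1.0]
print(flag_outlier_pairs(scores, threshold=3.0))  # [2]
```

Pair index 2 covers positions 4 and 5 of the series, which is where the spike sits; flat stretches produce detail coefficients of zero.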

Lin, L., Zhong, S., Jia, C., Chen, K.  2017.  Insider Threat Detection Based on Deep Belief Network Feature Representation. 2017 International Conference on Green Informatics (ICGI). :54–59.

Insider threat is a significant security risk for information systems, and detection of insider threats is a major concern for information system organizers. Recent work has mainly focused on single-pattern analysis of user single-domain behavior, which is not suitable for user behavior pattern analysis in multi-domain scenarios. However, the fusion of irrelevant multi-domain features may hide the existence of anomalies. Previous feature learning methods lose a relatively large proportion of information during feature extraction. Therefore, this paper proposes a hybrid model based on the deep belief network (DBN) to detect insider threat. First, an unsupervised DBN is used to extract hidden features from the multi-domain features extracted from the audit logs. Second, a One-Class SVM (OCSVM) is trained on the features learned by the DBN. The experimental results on the CERT dataset demonstrate that the DBN can be used to identify insider threat events and that it provides a new approach to feature processing for insider threat detection.

2017-11-20
Yang, Chaofei, Wu, Chunpeng, Li, Hai, Chen, Yiran, Barnell, Mark, Wu, Qing.  2016.  Security challenges in smart surveillance systems and the solutions based on emerging nano-devices. 2016 IEEE/ACM International Conference on Computer-Aided Design (ICCAD). :1–6.

Modern smart surveillance systems can not only record the monitored environment but also identify the targeted objects and detect anomalous activities. These advanced functions are often facilitated by deep neural networks, achieving very high accuracy and large data processing throughput. However, inappropriate design of the neural network may expose such smart systems to the risks of leaking the target being searched or even the adopted learning model itself to attackers. In this talk, we will present the security challenges in the design of smart surveillance systems. We will also discuss some possible solutions that leverage the unique properties of emerging nano-devices, including the incurred design and performance cost and optimization methods for minimizing these overheads.

2017-02-23
Y. Cao, J. Yang.  2015.  Towards Making Systems Forget with Machine Unlearning. 2015 IEEE Symposium on Security and Privacy. :463–480.

Today's systems produce a rapidly exploding amount of data, and the data further derives more data, forming a complex data propagation network that we call the data's lineage. There are many reasons that users want systems to forget certain data including its lineage. From a privacy perspective, users who become concerned with new privacy risks of a system often want the system to forget their data and lineage. From a security perspective, if an attacker pollutes an anomaly detector by injecting manually crafted data into the training data set, the detector must forget the injected data to regain security. From a usability perspective, a user can remove noise and incorrect entries so that a recommendation engine gives useful recommendations. Therefore, we envision forgetting systems, capable of forgetting certain data and their lineages, completely and quickly. This paper focuses on making learning systems forget, the process of which we call machine unlearning, or simply unlearning. We present a general, efficient unlearning approach by transforming learning algorithms used by a system into a summation form. To forget a training data sample, our approach simply updates a small number of summations – asymptotically faster than retraining from scratch. Our approach is general, because the summation form is from the statistical query learning in which many machine learning algorithms can be implemented. Our approach also applies to all stages of machine learning, including feature selection and modeling. Our evaluation, on four diverse learning systems and real-world workloads, shows that our approach is general, effective, fast, and easy to use.
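To make the summation idea concrete, the sketch below uses one-dimensional least squares, whose fit depends on the training data only through five running sums. Forgetting a sample then means subtracting its terms from those sums instead of retraining. This is an illustrative toy, not the authors' system; the class and method names are hypothetical, and it assumes that every sample passed to `unlearn` was previously passed to `learn`.

```python
class SummationLinearModel:
    """Least-squares line y = slope * x + intercept kept as running summations."""

    def __init__(self):
        self.n = 0
        self.sx = self.sy = self.sxx = self.sxy = 0

    def _update(self, x, y, sign):
        # Each sample contributes one term to every summation.
        self.n += sign
        self.sx += sign * x
        self.sy += sign * y
        self.sxx += sign * x * x
        self.sxy += sign * x * y

    def learn(self, x, y):
        self._update(x, y, +1)

    def unlearn(self, x, y):
        # Constant-time forgetting: remove the sample's terms, no retraining pass.
        self._update(x, y, -1)

    def slope_intercept(self):
        denom = self.n * self.sxx - self.sx ** 2
        slope = (self.n * self.sxy - self.sx * self.sy) / denom
        intercept = (self.sy - slope * self.sx) / self.n
        return slope, intercept

model = SummationLinearModel()
for x, y in [(0, 1), (1, 3), (2, 5), (3, 100)]:  # the last point is a polluted sample
    model.learn(x, y)
model.unlearn(3, 100)
print(model.slope_intercept())  # (2.0, 1.0), the same as fitting only the clean points
```

The cost of `unlearn` does not grow with the size of the training set, which is the kind of speed-up over retraining from scratch that the abstract describes.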

2015-04-30
Yexing Li, Xinye Cai, Zhun Fan, Qingfu Zhang.  2014.  An external archive guided multiobjective evolutionary approach based on decomposition for continuous optimization. Evolutionary Computation (CEC), 2014 IEEE Congress on. :1124–1130.

In this paper, we propose a decomposition-based multiobjective evolutionary algorithm that extracts information from an external archive to guide the evolutionary search for continuous optimization problems. The proposed algorithm uses a mechanism to identify the promising regions (subproblems) by learning information from the external archive to guide the evolutionary search process. To demonstrate the performance of the algorithm, we conduct experiments to compare it with other decomposition-based approaches. The results validate that our proposed algorithm is very competitive.