Visible to the public Biblio

Filters: Keyword is online learning  [Clear All Filters]
2023-06-22
Seetharaman, Sanjay, Malaviya, Shubham, Vasu, Rosni, Shukla, Manish, Lodha, Sachin.  2022.  Influence Based Defense Against Data Poisoning Attacks in Online Learning. 2022 14th International Conference on COMmunication Systems & NETworkS (COMSNETS). :1–6.
Data poisoning is a type of adversarial attack on training data where an attacker manipulates a fraction of data to degrade the performance of machine learning model. There are several known defensive mechanisms for handling offline attacks, however defensive measures for online learning, where data points arrive sequentially, have not garnered similar interest. In this work, we propose a defense mechanism to minimize the degradation caused by the poisoned training data on a learner's model in an online setup. Our proposed method utilizes an influence function which is a classic technique in robust statistics. Further, we supplement it with the existing data sanitization methods for filtering out some of the poisoned data points. We study the effectiveness of our defense mechanism on multiple datasets and across multiple attack strategies against an online learner.
ISSN: 2155-2509
2023-03-31
Xu, Zichuan, Ren, Wenhao, Liang, Weifa, Xu, Wenzheng, Xia, Qiufen, Zhou, Pan, Li, Mingchu.  2022.  Schedule or Wait: Age-Minimization for IoT Big Data Processing in MEC via Online Learning. IEEE INFOCOM 2022 - IEEE Conference on Computer Communications. :1809–1818.
The age of data (AoD) is identified as one of the most novel and important metrics to measure the quality of big data analytics for Internet-of-Things (IoT) applications. Meanwhile, mobile edge computing (MEC) is envisioned as an enabling technology to minimize the AoD of IoT applications by processing the data in edge servers close to IoT devices. In this paper, we study the AoD minimization problem for IoT big data processing in MEC networks. We first propose an exact solution for the problem by formulating it as an Integer Linear Program (ILP). We then propose an efficient heuristic for the offline AoD minimization problem. We also devise an approximation algorithm with a provable approximation ratio for a special case of the problem, by leveraging the parametric rounding technique. We thirdly develop an online learning algorithm with a bounded regret for the online AoD minimization problem under dynamic arrivals of IoT requests and uncertain network delay assumptions, by adopting the Multi-Armed Bandit (MAB) technique. We finally evaluate the performance of the proposed algorithms by extensive simulations and implementations in a real test-bed. Results show that the proposed algorithms outperform existing approaches by reducing the AoD around 10%.
ISSN: 2641-9874
2023-01-06
Zhu, Yanxu, Wen, Hong, Zhang, Peng, Han, Wen, Sun, Fan, Jia, Jia.  2022.  Poisoning Attack against Online Regression Learning with Maximum Loss for Edge Intelligence. 2022 International Conference on Computing, Communication, Perception and Quantum Technology (CCPQT). :169—173.
Recent trends in the convergence of edge computing and artificial intelligence (AI) have led to a new paradigm of “edge intelligence”, which are more vulnerable to attack such as data and model poisoning and evasion of attacks. This paper proposes a white-box poisoning attack against online regression model for edge intelligence environment, which aim to prepare the protection methods in the future. Firstly, the new method selects data points from original stream with maximum loss by two selection strategies; Secondly, it pollutes these points with gradient ascent strategy. At last, it injects polluted points into original stream being sent to target model to complete the attack process. We extensively evaluate our proposed attack on open dataset, the results of which demonstrate the effectiveness of the novel attack method and the real implications of poisoning attack in a case study electric energy prediction application.
2021-01-11
Jiang, P., Liao, S..  2020.  Differential Privacy Online Learning Based on the Composition Theorem. 2020 IEEE 10th International Conference on Electronics Information and Emergency Communication (ICEIEC). :200–203.
Privacy protection is becoming more and more important in the era of big data. Differential privacy is a rigorous and provable privacy protection method that can protect privacy for a single piece of data. But existing differential privacy online learning methods have great limitations in the scope of application and accuracy. Aiming at this problem, we propose a more general and accurate algorithm, named DPOL-CT, for differential privacy online learning. We first distinguish the difference in differential privacy protection between offline learning and online learning. Then we prove that the DPOL-CT algorithm achieves (∊, δ)-differential privacy for online learning under the Gaussian, the Laplace and the Staircase mechanisms and enjoys a sublinear expected regret bound. We further discuss the trade-off between the differential privacy level and the regret bound. Theoretical analysis and experimental results show that the DPOL-CT algorithm has good performance guarantees.
2020-11-17
Conway, A. E., Wang, M., Ljuca, E., Lebling, P. D..  2019.  A Dynamic Transport Overlay System for Mission-Oriented Dispersed Computing Over IoBT. MILCOM 2019 - 2019 IEEE Military Communications Conference (MILCOM). :815—820.

A dynamic overlay system is presented for supporting transport service needs of dispersed computing applications for moving data and/or code between network computation points and end-users in IoT or IoBT. The Network Backhaul Layered Architecture (Nebula) system combines network discovery and QoS monitoring, dynamic path optimization, online learning, and per-hop tunnel transport protocol optimization and synthesis over paths, to carry application traffic flows transparently over overlay tunnels. An overview is provided of Nebula's overlay system, software architecture, API, and implementation in the NRL CORE network emulator. Experimental emulation results demonstrate the performance benefits that Nebula provides under challenging networking conditions.

2019-12-16
Hou, Ming, Li, Dequan, Wu, Xiongjun, Shen, Xiuyu.  2019.  Differential Privacy of Online Distributed Optimization under Adversarial Nodes. 2019 Chinese Control Conference (CCC). :2172-2177.

Nowadays, many applications involve big data and big data analysis methods appear in many fields. As a preliminary attempt to solve the challenge of big data analysis, this paper presents a distributed online learning algorithm based on differential privacy. Since online learning can effectively process sensitive data, we introduce the concept of differential privacy in distributed online learning algorithms, with the aim at ensuring data privacy during online learning to prevent adversarial nodes from inferring any important data information. In particular, for different adversary models, we consider different type graphs to tolerate a limited number of adversaries near each regular node or tolerate a global limited number of adversaries.

2019-12-10
Tai, Kai Sheng, Sharan, Vatsal, Bailis, Peter, Valiant, Gregory.  2018.  Sketching Linear Classifiers over Data Streams. Proceedings of the 2018 International Conference on Management of Data. :757-772.

We introduce a new sub-linear space sketch—the Weight-Median Sketch—for learning compressed linear classifiers over data streams while supporting the efficient recovery of large-magnitude weights in the model. This enables memory-limited execution of several statistical analyses over streams, including online feature selection, streaming data explanation, relative deltoid detection, and streaming estimation of pointwise mutual information. Unlike related sketches that capture the most frequently-occurring features (or items) in a data stream, the Weight-Median Sketch captures the features that are most discriminative of one stream (or class) compared to another. The Weight-Median Sketch adopts the core data structure used in the Count-Sketch, but, instead of sketching counts, it captures sketched gradient updates to the model parameters. We provide a theoretical analysis that establishes recovery guarantees for batch and online learning, and demonstrate empirical improvements in memory-accuracy trade-offs over alternative memory-budgeted methods, including count-based sketches and feature hashing.

2019-08-12
Ma, C., Yang, X., Wang, H..  2018.  Randomized Online CP Decomposition. 2018 Tenth International Conference on Advanced Computational Intelligence (ICACI). :414-419.

CANDECOMP/PARAFAC (CP) decomposition has been widely used to deal with multi-way data. For real-time or large-scale tensors, based on the ideas of randomized-sampling CP decomposition algorithm and online CP decomposition algorithm, a novel CP decomposition algorithm called randomized online CP decomposition (ROCP) is proposed in this paper. The proposed algorithm can avoid forming full Khatri-Rao product, which leads to boost the speed largely and reduce memory usage. The experimental results on synthetic data and real-world data show the ROCP algorithm is able to cope with CP decomposition for large-scale tensors with arbitrary number of dimensions. In addition, ROCP can reduce the computing time and memory usage dramatically, especially for large-scale tensors.

2019-02-13
Ammar, M., Washha, M., Crispo, B..  2018.  WISE: Lightweight Intelligent Swarm Attestation Scheme for IoT (The Verifier’s Perspective). 2018 14th International Conference on Wireless and Mobile Computing, Networking and Communications (WiMob). :1–8.
The growing pervasiveness of Internet of Things (IoT) expands the attack surface by connecting more and more attractive attack targets, i.e. embedded devices, to the Internet. One key component in securing these devices is software integrity checking, which typically attained with Remote Attestation (RA). RA is realized as an interactive protocol, whereby a trusted party, verifier, verifies the software integrity of a potentially compromised remote device, prover. In the vast majority of IoT applications, smart devices operate in swarms, thus triggering the need for efficient swarm attestation schemes.In this paper, we present WISE, the first intelligent swarm attestation protocol that aims to minimize the communication overhead while preserving an adequate level of security. WISE depends on a resource-efficient smart broadcast authentication scheme where devices are organized in fine-grained multi-clusters, and whenever needed, the most likely compromised devices are attested. The candidate devices are selected intelligently taking into account the attestation history and the diverse characteristics (and constraints) of each device in the swarm. We show that WISE is very suitable for resource-constrained embedded devices, highly efficient and scalable in heterogenous IoT networks, and offers an adjustable level of security.
2019-02-08
Ivanova, M., Durcheva, M., Baneres, D., Rodríguez, M. E..  2018.  eAssessment by Using a Trustworthy System in Blended and Online Institutions. 2018 17th International Conference on Information Technology Based Higher Education and Training (ITHET). :1-7.

eAssessment uses technology to support online evaluation of students' knowledge and skills. However, challenging problems must be addressed such as trustworthiness among students and teachers in blended and online settings. The TeSLA system proposes an innovative solution to guarantee correct authentication of students and to prove the authorship of their assessment tasks. Technologically, the system is based on the integration of five instruments: face recognition, voice recognition, keystroke dynamics, forensic analysis, and plagiarism. The paper aims to analyze and compare the results achieved after the second pilot performed in an online and a blended university revealing the realization of trust-driven solutions for eAssessment.

2018-03-26
Hu, Zhisheng, Zhu, Minghui, Liu, Peng.  2017.  Online Algorithms for Adaptive Cyber Defense on Bayesian Attack Graphs. Proceedings of the 2017 Workshop on Moving Target Defense. :99–109.

Emerging zero-day vulnerabilities in information and communications technology systems make cyber defenses very challenging. In particular, the defender faces uncertainties of; e.g., system states and the locations and the impacts of vulnerabilities. In this paper, we study the defense problem on a computer network that is modeled as a partially observable Markov decision process on a Bayesian attack graph. We propose online algorithms which allow the defender to identify effective defense policies when utility functions are unknown a priori. The algorithm performance is verified via numerical simulations based on real-world attacks.

2017-09-15
Ghassemi, Mohsen, Sarwate, Anand D., Wright, Rebecca N..  2016.  Differentially Private Online Active Learning with Applications to Anomaly Detection. Proceedings of the 2016 ACM Workshop on Artificial Intelligence and Security. :117–128.

In settings where data instances are generated sequentially or in streaming fashion, online learning algorithms can learn predictors using incremental training algorithms such as stochastic gradient descent. In some security applications such as training anomaly detectors, the data streams may consist of private information or transactions and the output of the learning algorithms may reveal information about the training data. Differential privacy is a framework for quantifying the privacy risk in such settings. This paper proposes two differentially private strategies to mitigate privacy risk when training a classifier for anomaly detection in an online setting. The first is to use a randomized active learning heuristic to screen out uninformative data points in the stream. The second is to use mini-batching to improve classifier performance. Experimental results show how these two strategies can trade off privacy, label complexity, and generalization performance.

2017-09-06
C. Theisen, L. Williams, K. Oliver, E. Murphy-Hill.  2016.  Software Security Education at Scale. 2016 IEEE/ACM 38th International Conference on Software Engineering Companion (ICSE-C). :346-355.

Massively Open Online Courses (MOOCs) provide a unique opportunity to reach out to students who would not normally be reached by alleviating the need to be physically present in the classroom. However, teaching software security coursework outside of a classroom setting can be challenging. What are the challenges when converting security material from an on-campus course to the MOOC format? The goal of this research is to assist educators in constructing software security coursework by providing a comparison of classroom courses and MOOCs. In this work, we compare demographic information, student motivations, and student results from an on-campus software security course and a MOOC version of the same course. We found that the two populations of students differed, with the MOOC reaching a more diverse set of students than the on-campus course. We found that students in the on-campus course had higher quiz scores, on average, than students in the MOOC. Finally, we document our experience running the courses and what we would do differently to assist future educators constructing similar MOOC's.

2017-07-24
Ghassemi, Mohsen, Sarwate, Anand D., Wright, Rebecca N..  2016.  Differentially Private Online Active Learning with Applications to Anomaly Detection. Proceedings of the 2016 ACM Workshop on Artificial Intelligence and Security. :117–128.

In settings where data instances are generated sequentially or in streaming fashion, online learning algorithms can learn predictors using incremental training algorithms such as stochastic gradient descent. In some security applications such as training anomaly detectors, the data streams may consist of private information or transactions and the output of the learning algorithms may reveal information about the training data. Differential privacy is a framework for quantifying the privacy risk in such settings. This paper proposes two differentially private strategies to mitigate privacy risk when training a classifier for anomaly detection in an online setting. The first is to use a randomized active learning heuristic to screen out uninformative data points in the stream. The second is to use mini-batching to improve classifier performance. Experimental results show how these two strategies can trade off privacy, label complexity, and generalization performance.

2015-05-05
Yanfei Guo, Lama, P., Changjun Jiang, Xiaobo Zhou.  2014.  Automated and Agile Server ParameterTuning by Coordinated Learning and Control. Parallel and Distributed Systems, IEEE Transactions on. 25:876-886.

Automated server parameter tuning is crucial to performance and availability of Internet applications hosted in cloud environments. It is challenging due to high dynamics and burstiness of workloads, multi-tier service architecture, and virtualized server infrastructure. In this paper, we investigate automated and agile server parameter tuning for maximizing effective throughput of multi-tier Internet applications. A recent study proposed a reinforcement learning based server parameter tuning approach for minimizing average response time of multi-tier applications. Reinforcement learning is a decision making process determining the parameter tuning direction based on trial-and-error, instead of quantitative values for agile parameter tuning. It relies on a predefined adjustment value for each tuning action. However it is nontrivial or even infeasible to find an optimal value under highly dynamic and bursty workloads. We design a neural fuzzy control based approach that combines the strengths of fast online learning and self-adaptiveness of neural networks and fuzzy control. Due to the model independence, it is robust to highly dynamic and bursty workloads. It is agile in server parameter tuning due to its quantitative control outputs. We implemented the new approach on a testbed of virtualized data center hosting RUBiS and WikiBench benchmark applications. Experimental results demonstrate that the new approach significantly outperforms the reinforcement learning based approach for both improving effective system throughput and minimizing average response time.