Biblio
Recent approaches have proven the effectiveness of local outlier factor-based outlier detection when applied over traffic flow probability distributions. However, these approaches used distance metrics based on the Bhattacharyya coefficient when calculating probability distribution similarity. Consequently, the limited expressiveness of the Bhattacharyya coefficient restricted the accuracy of the methods. The crucial deficiency of the Bhattacharyya distance metric is its inability to compare distributions with non-overlapping sample spaces over the domain of natural numbers. Traffic flow intensity varies greatly, which results in numerous non-overlapping sample spaces, rendering metrics based on the Bhattacharyya coefficient inappropriate. In this work, we address this issue by exploring alternative distance metrics and showing their applicability in a massive real-life traffic flow data set from 26 vital intersections in The Hague. The results on these data collected from 272 sensors for more than two years show various advantages of the Earth Mover's distance both in effectiveness and efficiency.
Short-term load forecasting systems for power grids have demonstrated high accuracy and have been widely employed for commercial use. However, classic load forecasting systems, which are based on statistical methods, are subject to vulnerability from training data poisoning. In this paper, we demonstrate a data poisoning strategy that effectively corrupts the forecasting model even in the presence of outlier detection. To the best of our knowledge, poisoning attack on short-term load forecasting with outlier detection has not been studied in previous works. Our method applies to several forecasting models, including the most widely-adapted and best-performing ones, such as multiple linear regression (MLR) and neural network (NN) models. Starting with the MLR model, we develop a novel closed-form solution to quickly estimate the new MLR model after a round of data poisoning without retraining. We then employ line search and simulated annealing to find the poisoning attack solution. Furthermore, we use the MLR attacking solution to generate a numerical solution for other models, such as NN. The effectiveness of our algorithm has been tested on the Global Energy Forecasting Competition (GEFCom2012) data set with the presence of outlier detection.
Imposing security in MANET is very challenging and hot topic of research science last two decades because of its wide applicability in applications like defense. Number of efforts has been made in this direction. But available security algorithms, methods, models and framework may not completely solve this problem. Motivated from various existing security methods and outlier detection, in this paper novel simple but efficient outlier detection scheme based security algorithm is proposed to protect the Ad hoc on demand distance vector (AODV) reactive routing protocol from Black hole attack in mobile ad hoc environment. Simulation results obtained from network simulator tool evident the simplicity, robustness and effectiveness of the proposed algorithm over the original AODV protocol and existing methods.
This work concerns distributed consensus algorithms and application to a network intrusion detection system (NIDS) [21]. We consider the problem of defending the system against multiple data falsification attacks (Byzantine attacks), a vulnerability of distributed peer-to-peer consensus algorithms that has not been widely addressed in its practicality. We consider both naive (independent) and colluding attackers. We test three defense strategy implementations, two classified as outlier detection methods and one reputation-based method. We have narrowed our attention to outlier and reputation-based methods because they are relatively light computationally speaking. We have left out control theoretic methods which are likely the most effective methods, however their computational cost increase rapidly with the number of attackers. We compare the efficiency of these three implementations for their computational cost, detection performance, convergence behavior and possible impacts on the intrusion detection accuracy of the NIDS. Tests are performed based on simulations of distributed denial of service attacks using the KSL-KDD data set.
Outlier detection is a fundamental data science task with applications ranging from data cleaning to network security. Recently, a new class of outlier detection algorithms has emerged, called contextual outlier detection, and has shown improved performance when studying anomalous behavior in a specific context. However, as we point out in this article, such approaches have limited applicability in situations where the context is sparse (i.e., lacking a suitable frame of reference). Moreover, approaches developed to date do not scale to large datasets. To address these problems, here we propose a novel and robust approach alternative to the state-of-the-art called RObust Contextual Outlier Detection (ROCOD). We utilize a local and global behavioral model based on the relevant contexts, which is then integrated in a natural and robust fashion. We run ROCOD on both synthetic and real-world datasets and demonstrate that it outperforms other competitive baselines on the axes of efficacy and efficiency. We also drill down and perform a fine-grained analysis to shed light on the rationale for the performance gains of ROCOD and reveal its effectiveness when handling objects with sparse contexts.
Since the number of cyber attacks by insider threats and the damage caused by them has been increasing over the last years, organizations are in need for specific security solutions to counter these threats. To limit the damage caused by insider threats, the timely detection of erratic system behavior and malicious activities is of primary importance. We observed a major paradigm shift towards anomaly-focused detection mechanisms, which try to establish a baseline of system behavior – based on system logging data – and report any deviations from this baseline. While these approaches are promising, they usually have to cope with scalability issues. As the amount of log data generated during IT operations is exponentially growing, high-performance security solutions are required that can handle this huge amount of data in real time. In this paper, we demonstrate how high-performance bioinformatics tools can be leveraged to tackle this issue, and we demonstrate their application to log data for outlier detection, to timely detect anomalous system behavior that points to insider attacks.
Keystroke dynamics is a form of behavioral biometrics that can be used for continuous authentication of computer users. Many classifiers have been proposed for the analysis of acquired user patterns and verification of users at computer terminals. The underlying machine learning methods that use Gaussian density estimator for outlier detection typically assume that the digraph patterns in keystroke data are generated from a single Gaussian distribution. In this paper, we relax this assumption by allowing digraphs to fit more than one distribution via the Gaussian Mixture Model (GMM). We have conducted an experiment with a public data set collected in a controlled environment. Out of 30 users with dynamic text, we obtain 0.08% Equal Error Rate (EER) with 2 components by using GMM, while pure Gaussian yields 1.3% EER for the same data set (an improvement of EER by 93.8%). Our results show that GMM can recognize keystroke dynamics more precisely and authenticate users with higher confidence level.
Outlier detection is a fundamental data science task with applications ranging from data cleaning to network security. Recently, a new class of outlier detection algorithms has emerged, called contextual outlier detection, and has shown improved performance when studying anomalous behavior in a specific context. However, as we point out in this article, such approaches have limited applicability in situations where the context is sparse (i.e., lacking a suitable frame of reference). Moreover, approaches developed to date do not scale to large datasets. To address these problems, here we propose a novel and robust approach alternative to the state-of-the-art called RObust Contextual Outlier Detection (ROCOD). We utilize a local and global behavioral model based on the relevant contexts, which is then integrated in a natural and robust fashion. We run ROCOD on both synthetic and real-world datasets and demonstrate that it outperforms other competitive baselines on the axes of efficacy and efficiency. We also drill down and perform a fine-grained analysis to shed light on the rationale for the performance gains of ROCOD and reveal its effectiveness when handling objects with sparse contexts.
Misalignment angles estimation of strapdown inertial navigation system (INS) using global positioning system (GPS) data is highly affected by measurement noises, especially with noises displaying time varying statistical properties. Hence, adaptive filtering approach is recommended for the purpose of improving the accuracy of in-motion alignment. In this paper, a simplified form of Celso's adaptive stochastic filtering is derived and applied to estimate both the INS error states and measurement noise statistics. To detect and bound the influence of outliers in INS/GPS integration, outlier detection based on jerk tracking model is also proposed. The accuracy and validity of the proposed algorithm is tested through ground based navigation experiments.
To improve comprehensive performance of denoising range images, an impulsive noise (IN) denoising method with variable windows is proposed in this paper. Founded on several discriminant criteria, the principles of dropout IN detection and outlier IN detection are provided. Subsequently, a nearest non-IN neighbors searching process and an Index Distance Weighted Mean filter is combined for IN denoising. As key factors of adapatablity of the proposed denoising method, the sizes of two windows for outlier INs detection and INs denoising are investigated. Originated from a theoretical model of invader occlusion, variable window is presented for adapting window size to dynamic environment of each point, accompanying with practical criteria of adaptive variable window size determination. Experiments on real range images of multi-line surface are proceeded with evaluations in terms of computational complexity and quality assessment with comparison analysis among a few other popular methods. It is indicated that the proposed method can detect the impulsive noises with high accuracy, meanwhile, denoise them with strong adaptability with the help of variable window.