Biblio

Found 773 results

Filters: Keyword is Training
2018-02-06
Scheitle, Q., Gasser, O., Rouhi, M., Carle, G..  2017.  Large-Scale Classification of IPv6-IPv4 Siblings with Variable Clock Skew. 2017 Network Traffic Measurement and Analysis Conference (TMA). :1–9.

Linking the growing IPv6 deployment to existing IPv4 addresses is an interesting field of research, be it for network forensics, structural analysis, or reconnaissance. In this work, we focus on classifying pairs of server IPv6 and IPv4 addresses as siblings, i.e., running on the same machine. Our methodology leverages active measurements of TCP timestamps and other network characteristics, which we measure against a diverse ground truth of 682 hosts. We define and extract a set of features, including estimation of variable (opposed to constant) remote clock skew. On these features, we train a manually crafted algorithm as well as a machine-learned decision tree. By conducting several measurement runs and training in cross-validation rounds, we aim to create models that generalize well and do not overfit our training data. We find both models to exceed 99% precision in train and test performance. We validate scalability by classifying 149k siblings in a large-scale measurement of 371k sibling candidates. We argue that this methodology, thoroughly cross-validated and likely to generalize well, can aid comparative studies of IPv6 and IPv4 behavior in the Internet. Striving for applicability and replicability, we release ready-to-use source code and raw data from our study.
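
The clock-skew feature at the heart of this classification can be illustrated with a small sketch: the offset between a remote host's TCP timestamp clock and the local clock drifts roughly linearly over time, and the slope of that drift is the skew. The snippet below is a minimal sketch assuming a constant skew and a 1 kHz remote timestamp clock (the paper additionally models variable skew); it recovers the skew with a least-squares fit over synthetic probe data.

```python
import numpy as np

def estimate_clock_skew(recv_times, tcp_ts_values, ts_hz=1000):
    """Estimate remote clock skew (ppm) from TCP timestamp observations.

    recv_times    -- local receive times in seconds
    tcp_ts_values -- raw TCP timestamp counter values from the remote host
    ts_hz         -- assumed remote timestamp tick rate (Hz)
    """
    recv_times = np.asarray(recv_times, dtype=float)
    remote_secs = np.asarray(tcp_ts_values, dtype=float) / ts_hz
    # Offset of the remote clock relative to the local clock over the measurement.
    offset = (remote_secs - remote_secs[0]) - (recv_times - recv_times[0])
    # Slope of a least-squares line through the offsets is the skew (s/s -> ppm).
    slope, _ = np.polyfit(recv_times - recv_times[0], offset, 1)
    return slope * 1e6

# Synthetic example: a host whose clock runs about 50 ppm fast.
t = np.linspace(0, 600, 200)                 # ten minutes of probes
ts = ((t * (1 + 50e-6)) * 1000).astype(int)  # remote 1 kHz timestamp counter
print(round(estimate_clock_skew(t, ts), 1), "ppm")
```

Sibling classification would then compare such skew estimates, together with the other features, measured on the candidate IPv4 and IPv6 addresses of a host.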

2018-09-05
Turnley, J., Wachtel, A., Muñoz-Ramos, K., Hoffman, M., Gauthier, J., Speed, A., Kittinger, R..  2017.  Modeling human-technology interaction as a sociotechnical system of systems. 2017 12th System of Systems Engineering Conference (SoSE). :1–6.
As system of systems (SoS) models become increasingly complex and interconnected, a new approach is needed to capture the effects of humans within the SoS. Many real-life events have shown the detrimental outcomes of failing to account for humans in the loop. This research introduces a novel, cross-disciplinary methodology for modeling humans interacting with technologies to perform tasks within an SoS, specifically within a layered physical security system use case. Metrics and formulations developed for this new way of looking at SoS, termed a sociotechnical SoS, allow for quantifying the interplay of effectiveness and efficiency seen in detection theory to measure the ability of a physical security system to detect and respond to threats. This methodology has been applied to a notional representation of a small military Forward Operating Base (FOB) as a proof of concept.
2018-06-07
Aygun, R. C., Yavuz, A. G..  2017.  Network Anomaly Detection with Stochastically Improved Autoencoder Based Models. 2017 IEEE 4th International Conference on Cyber Security and Cloud Computing (CSCloud). :193–198.

Intrusion detection systems do not perform well when it comes to detecting zero-day attacks; therefore, improving their performance in that regard is an active research topic. In this study, to detect zero-day attacks with high accuracy, we proposed two deep learning based anomaly detection models using an autoencoder and a denoising autoencoder, respectively. The key factor that directly affects the accuracy of the proposed models is the threshold value, which was determined using a stochastic approach rather than the approaches available in the current literature. The proposed models were tested using the KDDTest+ dataset contained in NSL-KDD, and we achieved accuracies of 88.28% and 88.65%, respectively. The obtained results show that, as singular models, our proposed anomaly detection models outperform other singular anomaly detection methods and perform almost as well as the newly suggested hybrid anomaly detection models.
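
The core mechanism, a reconstruction-error threshold on an autoencoder, can be sketched in a few lines. The example below is only an illustration: it uses scikit-learn's MLPRegressor trained to reproduce its input as a stand-in autoencoder, synthetic data in place of NSL-KDD, and a simple percentile threshold rather than the paper's stochastic threshold selection.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X_train = rng.normal(size=(2000, 20))                       # stand-in for benign traffic features
X_test = np.vstack([rng.normal(size=(100, 20)),             # benign test samples
                    rng.normal(loc=4.0, size=(20, 20))])    # "attack" outliers

scaler = StandardScaler().fit(X_train)
X_train_s, X_test_s = scaler.transform(X_train), scaler.transform(X_test)

# A small MLP trained to reconstruct its own input acts as the autoencoder.
ae = MLPRegressor(hidden_layer_sizes=(8,), activation="relu",
                  max_iter=500, random_state=0)
ae.fit(X_train_s, X_train_s)

def reconstruction_error(model, X):
    return np.mean((model.predict(X) - X) ** 2, axis=1)

# Threshold taken from the training-error distribution (here: the 99th percentile).
threshold = np.percentile(reconstruction_error(ae, X_train_s), 99)
anomalous = reconstruction_error(ae, X_test_s) > threshold
print(f"flagged {anomalous.sum()} of {len(X_test_s)} test samples as anomalous")
```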

2017-12-20
Weedon, M., Tsaptsinos, D., Denholm-Price, J..  2017.  Random forest explorations for URL classification. 2017 International Conference On Cyber Situational Awareness, Data Analytics And Assessment (Cyber SA). :1–4.

Phishing is a major concern on the Internet today, and many users are falling victim to criminals' deceitful tactics. Blacklisting is still the most common defence users have against such phishing websites, but it is failing to cope with their increasing number. In recent years, researchers have devised modern ways of detecting such websites using machine learning. One such method is to create machine-learnt models of URL features to classify whether URLs are phishing. However, there are varying opinions on what the best approach is for features and algorithms. In this paper, the objective is to evaluate the performance of the Random Forest algorithm using a lexical-only dataset. The performance is benchmarked against other machine learning algorithms and additionally against those reported in the literature. Initial results from experiments indicate that the Random Forest algorithm performs best, yielding 86.9% accuracy.
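
A lexical-only URL classifier of the kind evaluated here can be sketched as follows: purely string-level features are extracted from each URL and fed to scikit-learn's RandomForestClassifier. The feature set and the toy URLs below are illustrative assumptions, not the paper's dataset or feature list.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def lexical_features(url):
    """A handful of purely lexical features of a URL string (illustrative choices)."""
    return [
        len(url),
        url.count("."),
        url.count("-"),
        url.count("/"),
        sum(c.isdigit() for c in url),
        int("@" in url),
        int(url.startswith("https")),
        len(url.split("/")[2]) if url.count("/") >= 2 else len(url),  # hostname length
    ]

urls = ["https://example.com/index.html",
        "http://paypa1-secure-login.example.ru/verify?id=123",
        "https://en.wikipedia.org/wiki/Phishing",
        "http://192.168.10.5/bank/update-account"]
labels = [0, 1, 0, 1]   # 0 = legitimate, 1 = phishing (toy labels)

X = np.array([lexical_features(u) for u in urls])
clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, labels)
print(clf.predict([lexical_features("http://secure-paypa1.example.com/login")]))
```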

2018-05-25
Wei, T., Wang, Y., Zhu, Q..  2017.  Deep reinforcement learning for building HVAC control. 2017 54th ACM/EDAC/IEEE Design Automation Conference (DAC). :1–6.
2017-12-28
Kumar, S. A. P., Bhargava, B., Macêdo, R., Mani, G..  2017.  Securing IoT-Based Cyber-Physical Human Systems against Collaborative Attacks. 2017 IEEE International Congress on Internet of Things (ICIOT). :9–16.

Security issues in IoT-based CPS are exacerbated by human participation in CPHS due to the vulnerabilities in both the technologies and the human involvement. A holistic framework for mitigating security threats in the IoT-based CPHS environment is presented to address these issues. We have developed a threat model involving human elements in the CPHS environment. Research questions, directions, and ideas with respect to securing IoT-based CPHS against collaborative attacks are presented.

2018-04-02
Yousefi-Azar, M., Varadharajan, V., Hamey, L., Tupakula, U..  2017.  Autoencoder-Based Feature Learning for Cyber Security Applications. 2017 International Joint Conference on Neural Networks (IJCNN). :3854–3861.

This paper presents a novel feature learning model for cyber security tasks. We propose to use Auto-encoders (AEs), as a generative model, to learn latent representations of different feature sets. We show how well the AE is capable of automatically learning a reasonable notion of semantic similarity among input features. Specifically, the AE accepts a feature vector, obtained from cyber security phenomena, and extracts a code vector that captures the semantic similarity between the feature vectors. This similarity is embedded in an abstract latent representation. Because the AE is trained in an unsupervised fashion, the main part of this success comes from the appropriate original feature set that is used in this paper. It can also provide more discriminative features in contrast to other feature engineering approaches. Furthermore, the scheme can reduce the dimensionality of the features, thereby significantly minimising the memory requirements. We selected two different cyber security tasks: network-based anomaly intrusion detection and malware classification. We have analysed the proposed scheme with various classifiers using publicly available datasets for network anomaly intrusion detection and malware classification. Several appropriate evaluation metrics show improvement compared to prior results.

2018-04-04
Bao, D., Yang, F., Jiang, Q., Li, S., He, X..  2017.  Block RLS algorithm for surveillance video processing based on image sparse representation. 2017 29th Chinese Control And Decision Conference (CCDC). :2195–2200.

A block recursive least squares (BRLS) algorithm for dictionary learning in a compressed sensing system is developed for surveillance video processing. The new method uses image blocks directly and iteratively to train dictionaries via the BRLS algorithm, which differs from classical methods that require transforming blocks into columns first and then providing all training blocks at one time. Since the background in surveillance video is almost fixed, the residual of the foreground can be represented sparsely and reconstructed with background subtraction directly. The new method and framework are applied to real image and surveillance video processing. Simulation results show that the new method achieves better representation performance than classical ones for both images and surveillance video.

2018-02-28
Kaelbling, L. P., Lozano-Pérez, T..  2017.  Learning composable models of parameterized skills. 2017 IEEE International Conference on Robotics and Automation (ICRA). :886–893.

There has been a great deal of work on learning new robot skills, but very little consideration of how these newly acquired skills can be integrated into an overall intelligent system. A key aspect of such a system is compositionality: newly learned abilities have to be characterized in a form that will allow them to be flexibly combined with existing abilities, affording a (good!) combinatorial explosion in the robot's abilities. In this paper, we focus on learning models of the preconditions and effects of new parameterized skills, in a form that allows those actions to be combined with existing abilities by a generative planning and execution system.

2018-04-02
Alom, M. Z., Taha, T. M..  2017.  Network Intrusion Detection for Cyber Security on Neuromorphic Computing System. 2017 International Joint Conference on Neural Networks (IJCNN). :3830–3837.

In this paper, we demonstrate a neuromorphic cognitive computing approach to a Network Intrusion Detection System (IDS) for cyber security using Deep Learning (DL). The algorithmic power of DL has been merged with fast and extremely power-efficient neuromorphic processors for cyber security. In this implementation, the data are numerically encoded and trained with an unsupervised deep learning technique called an Auto Encoder (AE) in the training phase. The generated weights of the AE are used as initial weights for the supervised training phase using neural networks. The final weights are converted to discrete values using Discrete Vector Factorization (DVF) to generate crossbar weights, synaptic weights, and thresholds for neurons. Finally, the generated crossbar weights, synaptic weights, thresholds, and leak values are mapped to crossbars and neurons. In the testing phase, the encoded test samples are converted to spiking form using a hybrid encoding technique. The model has been deployed and tested on the IBM Neurosynaptic Core Simulator (NSCS) and on an actual IBM TrueNorth neurosynaptic chip. The experimental results show around 90.12% accuracy for network intrusion detection on the physical neuromorphic chip. Furthermore, we have investigated the proposed system not only for the detection of malicious packets but also for classifying specific types of attacks, achieving 81.31% recognition accuracy. The neuromorphic implementation provides high detection and classification accuracy for network intrusion detection with extremely low power.

2017-12-20
Le, T. A., Baydin, A. G., Zinkov, R., Wood, F..  2017.  Using synthetic data to train neural networks is model-based reasoning. 2017 International Joint Conference on Neural Networks (IJCNN). :3514–3521.
We draw a formal connection between using synthetic training data to optimize neural network parameters and approximate, Bayesian, model-based reasoning. In particular, training a neural network using synthetic data can be viewed as learning a proposal distribution generator for approximate inference in the synthetic-data generative model. We demonstrate this connection in a recognition task where we develop a novel Captcha-breaking architecture and train it using synthetic data, demonstrating both state-of-the-art performance and a way of computing task-specific posterior uncertainty. Using a neural network trained this way, we also demonstrate successful breaking of real-world Captchas currently used by Facebook and Wikipedia. Reasoning from these empirical results and drawing connections with Bayesian modeling, we discuss the robustness of synthetic data results and suggest important considerations for ensuring good neural network generalization when training with synthetic data.
2018-04-04
Babiker, M., Khalifa, O. O., Htike, K. K., Hassan, A., Zaharadeen, M..  2017.  Automated daily human activity recognition for video surveillance using neural network. 2017 IEEE 4th International Conference on Smart Instrumentation, Measurement and Application (ICSIMA). :1–5.

Surveillance video systems are gaining increasing attention in the field of computer vision due to users' demand for security. It is promising to observe human movement and predict such movements. There is a need for a surveillance system capable of overcoming the shortcomings of relying on human operators to keep monitoring and observing normal and suspect events at all times without lapses of attention, and of facilitating the control of a huge surveillance system network. In this paper, an intelligent human activity recognition system is developed. A series of digital image processing techniques is used in each stage of the proposed system, such as background subtraction, binarization, and morphological operations. A robust neural network was built based on a database of human activity features extracted from the frame sequences. A multi-layer feed-forward perceptron network is used to classify the activity models in the dataset. The classification results show high performance in all of the training, testing, and validation stages. Finally, these results lead to a promising activity recognition rate.
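
A minimal version of the described pipeline (background subtraction, binarization, morphological cleaning, then a feed-forward classifier on per-frame blob features) might look like the sketch below. It assumes OpenCV 4 and scikit-learn, uses a few simple shape features and synthetic frames as a smoke test, and is not the authors' feature set or network architecture.

```python
import cv2
import numpy as np
from sklearn.neural_network import MLPClassifier

backsub = cv2.createBackgroundSubtractorMOG2(history=200, varThreshold=25)
kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5))

def frame_features(frame):
    """Foreground mask -> binarization -> morphology -> simple blob features."""
    fg = backsub.apply(frame)
    _, binary = cv2.threshold(fg, 127, 255, cv2.THRESH_BINARY)
    clean = cv2.morphologyEx(binary, cv2.MORPH_OPEN, kernel)
    contours, _ = cv2.findContours(clean, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    if not contours:
        return [0.0, 0.0, 0.0]
    c = max(contours, key=cv2.contourArea)
    x, y, w, h = cv2.boundingRect(c)
    return [cv2.contourArea(c), w / max(h, 1), float(y + h)]  # area, aspect ratio, lowest point

# Smoke test on random frames; real use would read labelled video frames instead.
frames = [np.random.randint(0, 255, (120, 160, 3), dtype=np.uint8) for _ in range(10)]
X = [frame_features(f) for f in frames]
y = np.random.randint(0, 3, len(frames))   # toy activity labels (e.g. walk / sit / fall)
clf = MLPClassifier(hidden_layer_sizes=(32, 16), max_iter=1000).fit(X, y)
print(clf.predict([frame_features(frames[0])]))
```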

2018-03-19
Ditzler, G., Prater, A..  2017.  Fine Tuning Lasso in an Adversarial Environment against Gradient Attacks. 2017 IEEE Symposium Series on Computational Intelligence (SSCI). :1–7.

Machine learning and data mining algorithms typically assume that the training and testing data are sampled from the same fixed probability distribution; however, this assumption is often violated in practice. The field of domain adaptation addresses the situation where this assumption of a fixed probability distribution across the two domains is violated; however, the difference between the two domains (training/source and testing/target) may not be known a priori. There has been a recent thrust in addressing the problem of learning in the presence of an adversary, which we formulate as a problem of domain adaptation to build a more robust classifier. This is because the overall security of classifiers and their preprocessing stages has been called into question with the recent findings of adversaries in a learning setting. Adversarial training (and testing) data pose a serious threat to scenarios where an attacker has the opportunity to ``poison'' the training or ``evade'' on the testing data set(s) in order to achieve something that is not in the best interest of the classifier. Recent work has begun to show the impact of adversarial data on several classifiers; however, the impact of the adversary on aspects related to the preprocessing of data (i.e., dimensionality reduction or feature selection) has widely been ignored in the revamp of adversarial learning research. Furthermore, variable selection, which is a vital component of any data analysis, has been shown to be particularly susceptible under an attacker that has knowledge of the task. In this work, we explore avenues for learning resilient classification models in the adversarial learning setting by considering the effects of adversarial data and how to mitigate its effects through optimization. Our model forms a single convex optimization problem that uses the labeled training data from the source domain and known weaknesses of the model for an adversarial component. We benchmark the proposed approach on synthetic data and show the trade-off between classification accuracy and skew-insensitive statistics.

2018-06-20
Hassen, M., Carvalho, M. M., Chan, P. K..  2017.  Malware classification using static analysis based features. 2017 IEEE Symposium Series on Computational Intelligence (SSCI). :1–7.

Anti-virus vendors receive hundreds of thousands of malware samples to be analysed each day. Some are new malware while others are variations or evolutions of existing malware. Because analyzing each malware sample by hand is impossible, automated techniques to analyse and categorize incoming samples are needed. In this work, we explore various machine learning features extracted from malware samples through static analysis for classification of malware binaries into already known malware families. We present a new feature based on control statement shingling that has a comparable accuracy to ordinary opcode n-gram based features while requiring smaller dimensions. This, in turn, results in a shorter training time.
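
The ordinary opcode n-gram features used as the baseline here can be produced with a hashing vectorizer, which keeps the dimensionality bounded; the control-statement shingling feature proposed in the paper is not shown. The opcode sequences below are toy stand-ins for disassembled binaries.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import HashingVectorizer

# Each sample is the opcode sequence of one disassembled binary (toy data).
samples = [
    "push mov call pop ret push mov xor jmp",
    "mov mov add cmp jne call ret",
    "push push call call ret xor xor jmp",
    "mov add sub cmp je call ret",
]
families = ["famA", "famB", "famA", "famB"]   # known malware family labels

# Hash 2- and 3-grams of opcodes into a fixed-size feature space.
vec = HashingVectorizer(analyzer="word", ngram_range=(2, 3),
                        n_features=2**12, alternate_sign=False)
X = vec.transform(samples)

clf = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, families)
print(clf.predict(vec.transform(["push mov call ret xor jmp"])))
```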

2018-04-02
Al-Zewairi, M., Almajali, S., Awajan, A..  2017.  Experimental Evaluation of a Multi-Layer Feed-Forward Artificial Neural Network Classifier for Network Intrusion Detection System. 2017 International Conference on New Trends in Computing Sciences (ICTCS). :167–172.

Deep Learning has been proven more effective than conventional machine-learning algorithms in solving classification problems with high dimensionality and complex features, especially when trained with big data. In this paper, a deep learning binomial classifier for a Network Intrusion Detection System is proposed and experimentally evaluated using the UNSW-NB15 dataset. Three different experiments were executed in order to determine the optimal activation function, then to select the most important features, and finally to test the proposed model on unseen data. The evaluation results demonstrate that the proposed classifier outperforms other models in the literature with 98.99% accuracy and a 0.56% false alarm rate on unseen data.

2018-02-14
Zhao, J., Shetty, S., Pan, J. W..  2017.  Feature-based transfer learning for network security. MILCOM 2017 - 2017 IEEE Military Communications Conference (MILCOM). :17–22.

New and unseen network attacks pose a great threat to signature-based detection systems. Consequently, machine learning-based approaches are designed to detect attacks, relying on features extracted from network data. A problem is caused by the differing distributions of features in the training and testing datasets, which affect the performance of the learned models. Moreover, generating labeled datasets is very time-consuming and expensive, which undercuts the effectiveness of supervised learning approaches. In this paper, we propose using transfer learning to detect previously unseen attacks. The main idea is to learn an optimized representation that is invariant to the changes of attack behaviors from labeled training sets and non-labeled testing sets, which contain different types of attacks, and to feed the representation to a supervised classifier. To the best of our knowledge, this is the first effort to use a feature-based transfer learning technique to detect unseen variants of network attacks. Furthermore, this technique can be used with any common base classifier. We evaluated the technique on publicly available datasets, and the results demonstrate the effectiveness of transfer learning in detecting new network attacks.
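
As one concrete (and much simpler) example of feature-based domain alignment, the sketch below uses correlation alignment (CORAL): the labeled source features are centered, whitened, re-colored with the target covariance, and shifted to the target mean before a supervised classifier is trained. This is a stand-in to illustrate the idea, not the representation-learning method proposed in the paper, and the data are synthetic.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def _sym_power(C, p):
    """Matrix power of a symmetric positive-definite matrix via eigendecomposition."""
    vals, vecs = np.linalg.eigh(C)
    return vecs @ np.diag(vals ** p) @ vecs.T

def coral(Xs, Xt, eps=1e-3):
    """Align source features Xs to the second-order statistics of target features Xt."""
    Cs = np.cov(Xs, rowvar=False) + eps * np.eye(Xs.shape[1])
    Ct = np.cov(Xt, rowvar=False) + eps * np.eye(Xt.shape[1])
    return (Xs - Xs.mean(0)) @ _sym_power(Cs, -0.5) @ _sym_power(Ct, 0.5) + Xt.mean(0)

rng = np.random.default_rng(0)
Xs = rng.normal(size=(500, 10)); ys = (Xs[:, 0] > 0).astype(int)   # labeled source traffic
Xt = 2.0 * rng.normal(size=(500, 10)) + 1.0                        # unlabeled target traffic

clf = LogisticRegression(max_iter=1000).fit(coral(Xs, Xt), ys)     # classifier on aligned features
print(clf.predict(Xt[:5]))                                         # applied directly to target data
```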

2017-12-28
Godfrey, L. B., Gashler, M. S..  2017.  Neural decomposition of time-series data. 2017 IEEE International Conference on Systems, Man, and Cybernetics (SMC). :2796–2801.

We present a neural network technique for the analysis and extrapolation of time-series data called Neural Decomposition (ND). Units with a sinusoidal activation function are used to perform a Fourier-like decomposition of training samples into a sum of sinusoids, augmented by units with nonperiodic activation functions to capture linear trends and other nonperiodic components. We show how careful weight initialization can be combined with regularization to form a simple model that generalizes well. Our method generalizes effectively on the Mackey-Glass series, a dataset of unemployment rates as reported by the U.S. Department of Labor Statistics, a time-series of monthly international airline passengers, and an unevenly sampled time-series of oxygen isotope measurements from a cave in north India. We find that ND outperforms popular time-series forecasting techniques including LSTM, echo state networks, (S)ARIMA, and SVR with a radial basis function.
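
The decomposition ND learns, sinusoidal components plus a nonperiodic trend, can be approximated in closed form by a least-squares fit on a Fourier-plus-trend basis, as in the sketch below; ND itself learns the frequencies, phases, and nonperiodic units by gradient descent, which this sketch does not do, and the series here is synthetic.

```python
import numpy as np

def fit_sinusoids_plus_trend(t, y, n_freqs=10):
    """Least-squares fit of y(t) ~ a0 + a1*t + sum_k [b_k sin(w_k t) + c_k cos(w_k t)]."""
    T = t[-1] - t[0]
    freqs = 2 * np.pi * np.arange(1, n_freqs + 1) / T        # harmonics of the record length
    basis = [np.ones_like(t), t]                             # bias + linear trend
    basis += [f(w * t) for w in freqs for f in (np.sin, np.cos)]
    coeffs, *_ = np.linalg.lstsq(np.column_stack(basis), y, rcond=None)
    return lambda t_new: np.column_stack(
        [np.ones_like(t_new), t_new] +
        [f(w * t_new) for w in freqs for f in (np.sin, np.cos)]) @ coeffs

# Toy series: linear trend + two periodic components + noise.
t = np.linspace(0, 10, 400)
y = (0.3 * t + np.sin(2 * np.pi * t / 5) + 0.5 * np.cos(2 * np.pi * t / 2.5)
     + 0.1 * np.random.randn(len(t)))
model = fit_sinusoids_plus_trend(t, y)
t_future = np.linspace(10, 12, 80)
print(model(t_future)[:5])   # extrapolated values beyond the training range
```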

2018-02-14
Stubbs, J. J., Birch, G. C., Woo, B. L., Kouhestani, C. G..  2017.  Physical security assessment with convolutional neural network transfer learning. 2017 International Carnahan Conference on Security Technology (ICCST). :1–6.

Deep learning techniques have demonstrated the ability to perform a variety of object recognition tasks using visible imager data; however, deep learning has not been implemented as a means to autonomously detect and assess targets of interest in a physical security system. We demonstrate the use of transfer learning on a convolutional neural network (CNN) to significantly reduce training time while keeping detection accuracy of physical-security-relevant targets high. Unlike many detection algorithms employed by video analytics within physical security systems, this method does not rely on temporal data to construct a background scene; targets of interest can halt motion indefinitely and still be detected by the implemented CNN. A key advantage of using deep learning is the ability of a network to improve over time. Periodic retraining can lead to better detection and higher confidence rates. We investigate training data size versus CNN test accuracy using physical security video data. Due to the large number of visible imagers, the significant volume of data collected daily, and currently deployed human-in-the-loop ground truth data, physical security systems present a unique environment that is well suited for analysis via CNNs. This could lead to the creation of an algorithmic element that reduces human burden and decreases human-analyzed nuisance alarms.
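
Transfer learning of the kind described, reusing a pretrained CNN backbone and retraining only a new classification head on site-specific imagery, is commonly set up as in the PyTorch sketch below (torchvision 0.13+ weights API assumed). The class names, tensor shapes, and single training step are placeholders, not the authors' network or data.

```python
import torch
import torch.nn as nn
from torchvision import models

# Start from an ImageNet-pretrained backbone and freeze its weights.
model = models.resnet18(weights="DEFAULT")
for p in model.parameters():
    p.requires_grad = False

# Replace the final layer with a head for site-specific classes
# (e.g. person / vehicle / nuisance) and train only that head.
num_classes = 3
model.fc = nn.Linear(model.fc.in_features, num_classes)
optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

# One toy training step on random tensors standing in for video frames.
images = torch.randn(8, 3, 224, 224)
labels = torch.randint(0, num_classes, (8,))
optimizer.zero_grad()
loss = criterion(model(images), labels)
loss.backward()
optimizer.step()
print(float(loss))
```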

2018-05-30
Howard, M., Pfeffer, A., Dalai, M., Reposa, M..  2017.  Predicting Signatures of Future Malware Variants. 2017 12th International Conference on Malicious and Unwanted Software (MALWARE). :126–132.
One of the challenges of malware defense is that the attacker has the advantage over the defender. In many cases, an attack is successful and causes damage before the defender can even begin to prepare a defense. The ability to anticipate attacks and prepare defenses before they occur would be a significant scientific and technological development with practical applications in cybersecurity. In this paper, we present a method to augment machine learning-based malware detection systems by predicting signatures of future malware variants and injecting these variants into the defensive system as a vaccine. Our method uses deep learning to learn patterns of malware evolution from family histories. These evolution patterns are then used to predict future family developments. Our experiments show that a detection system augmented with these future malware signatures is able to detect future malware variants that could not be detected by the detection system alone. In particular, it detected 11 new malware variants without increasing false positives, while providing up to 5 months of lead time between prediction and attack.
2018-08-06
Abaid, Z., Kaafar, M. A., Jha, S..  2017.  Quantifying the impact of adversarial evasion attacks on machine learning based android malware classifiers. 2017 IEEE 16th International Symposium on Network Computing and Applications (NCA). :1–10.
With the proliferation of Android-based devices, malicious apps have increasingly found their way to user devices. Many solutions for Android malware detection rely on machine learning; although effective, these are vulnerable to attacks from adversaries who wish to subvert these algorithms and allow malicious apps to evade detection. In this work, we present a statistical analysis of the impact of adversarial evasion attacks on various linear and non-linear classifiers, using a recently proposed Android malware classifier as a case study. We systematically explore the complete space of possible attacks varying in the adversary's knowledge about the classifier; our results show that it is possible to subvert linear classifiers (Support Vector Machines and Logistic Regression) by perturbing only a few features of malicious apps, with more knowledgeable adversaries degrading the classifier's detection rate from 100% to 0% and a completely blind adversary able to lower it to 12%. We show non-linear classifiers (Random Forest and Neural Network) to be more resilient to these attacks. We conclude our study with recommendations for designing classifiers to be more robust to the attacks presented in our work.
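
The linear-classifier result can be illustrated in miniature: with full knowledge of a logistic-regression model over binary app features, an attacker flips the few feature bits whose weights push the score most strongly toward the benign class. The sketch below uses synthetic binary features and labels, so it only demonstrates the attack principle, not the authors' classifier, feature set, or exact detection-rate numbers.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(1000, 50)).astype(float)     # binary app features
w_true = rng.normal(size=50)
y = (X @ w_true + rng.normal(scale=0.5, size=1000) > 0).astype(int)   # 1 = "malicious"

clf = LogisticRegression(max_iter=1000).fit(X, y)

def evade(x, model, n_changes=3):
    """White-box attack: flip the feature bits that most decrease the malicious score."""
    w = model.coef_[0]
    x = x.copy()
    # Score change from flipping bit i: +w[i] if adding the feature, -w[i] if removing it.
    deltas = np.where(x == 0, w, -w)
    for i in np.argsort(deltas)[:n_changes]:   # most negative score changes first
        x[i] = 1 - x[i]
    return x

malicious = X[y == 1]
evaded = np.array([evade(x, clf) for x in malicious])
print("detection rate before:", clf.predict(malicious).mean())
print("detection rate after :", clf.predict(evaded).mean())
```
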
2018-02-15
Wang, C., Lizana, F. R., Li, Z., Peterchev, A. V., Goetz, S. M..  2017.  Submodule short-circuit fault diagnosis based on wavelet transform and support vector machines for modular multilevel converter with series and parallel connectivity. IECON 2017 - 43rd Annual Conference of the IEEE Industrial Electronics Society. :3239–3244.

The modular multilevel converter with series and parallel connectivity was shown to provide advantages in several industrial applications. Its reliability largely depends on the absence of failures in the power semiconductors. We propose and analyze a fault-diagnosis technique to identify shorted switches based on features generated through wavelet transform of the converter output and subsequent classification in support vector machines. The multi-class support vector machine is trained with multiple recordings of the output of each fault condition as well as the converter under normal operation. Simulation results reveal that the proposed method has high classification latency and high robustness. Except for the monitoring of the output, which is required for the converter control in any case, this method does not require additional module sensors.
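
A stripped-down version of the wavelet-plus-SVM pipeline, decomposing the measured converter output with a discrete wavelet transform, using per-band energies as features, and training a multi-class SVM, could look like the sketch below (PyWavelets and scikit-learn assumed). The signals are synthetic stand-ins for the converter output under normal and shorted-switch conditions.

```python
import numpy as np
import pywt
from sklearn.svm import SVC

t = np.linspace(0, 1, 1024)
rng = np.random.default_rng(0)

def wavelet_energy_features(signal, wavelet="db4", level=4):
    """Energy of each sub-band of a discrete wavelet decomposition."""
    coeffs = pywt.wavedec(signal, wavelet, level=level)
    return [float(np.sum(c ** 2)) for c in coeffs]

def synth_signal(fault_class):
    """Synthetic stand-in for the converter output under a given condition."""
    base = np.sin(2 * np.pi * 50 * t)
    fault = fault_class * 0.3 * np.sin(2 * np.pi * 300 * t) * (t > 0.5)
    return base + fault + 0.05 * rng.normal(size=t.size)

X, y = [], []
for fault_class in (0, 1, 2):            # 0 = normal operation, 1/2 = fault conditions
    for _ in range(30):
        X.append(wavelet_energy_features(synth_signal(fault_class)))
        y.append(fault_class)

clf = SVC(kernel="rbf", C=10.0).fit(X, y)
print(clf.predict([wavelet_energy_features(synth_signal(2))]))
```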

2018-11-19
Huang, X., Belongie, S..  2017.  Arbitrary Style Transfer in Real-Time with Adaptive Instance Normalization. 2017 IEEE International Conference on Computer Vision (ICCV). :1510–1519.

Gatys et al. recently introduced a neural algorithm that renders a content image in the style of another image, achieving so-called style transfer. However, their framework requires a slow iterative optimization process, which limits its practical application. Fast approximations with feed-forward neural networks have been proposed to speed up neural style transfer. Unfortunately, the speed improvement comes at a cost: the network is usually tied to a fixed set of styles and cannot adapt to arbitrary new styles. In this paper, we present a simple yet effective approach that for the first time enables arbitrary style transfer in real-time. At the heart of our method is a novel adaptive instance normalization (AdaIN) layer that aligns the mean and variance of the content features with those of the style features. Our method achieves speed comparable to the fastest existing approach, without the restriction to a pre-defined set of styles. In addition, our approach allows flexible user controls such as content-style trade-off, style interpolation, color & spatial controls, all using a single feed-forward neural network.
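
The AdaIN layer itself is a channel-wise moment alignment: AdaIN(x, y) = sigma(y) * (x - mu(x)) / sigma(x) + mu(y), where the means and standard deviations are taken per channel over spatial positions. A minimal NumPy version of just this layer (channel-first feature maps assumed; the paper's surrounding encoder-decoder is not shown) is given below.

```python
import numpy as np

def adain(content, style, eps=1e-5):
    """AdaIN(x, y) = sigma(y) * (x - mu(x)) / sigma(x) + mu(y), computed per channel.

    content, style -- feature maps of shape (C, H, W)
    """
    mu_c = content.mean(axis=(1, 2), keepdims=True)
    std_c = content.std(axis=(1, 2), keepdims=True) + eps
    mu_s = style.mean(axis=(1, 2), keepdims=True)
    std_s = style.std(axis=(1, 2), keepdims=True) + eps
    return std_s * (content - mu_c) / std_c + mu_s

# Toy check: the output adopts the style's channel-wise means.
rng = np.random.default_rng(0)
c = rng.normal(size=(64, 32, 32))
s = 3.0 * rng.normal(size=(64, 32, 32)) + 1.0
out = adain(c, s)
print(np.allclose(out.mean(axis=(1, 2)), s.mean(axis=(1, 2))))
```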

Chen, D., Liao, J., Yuan, L., Yu, N., Hua, G..  2017.  Coherent Online Video Style Transfer. 2017 IEEE International Conference on Computer Vision (ICCV). :1114–1123.

Training a feed-forward network for the fast neural style transfer of images has proven successful, but the naive extension of processing videos frame by frame is prone to producing flickering results. We propose the first end-to-end network for online video style transfer, which generates temporally coherent stylized video sequences in near realtime. Two key ideas include an efficient network by incorporating short-term coherence, and propagating short-term coherence to long-term, which ensures consistency over a longer period of time. Our network can incorporate different image stylization networks and clearly outperforms the per-frame baseline both qualitatively and quantitatively. Moreover, it can achieve visually comparable coherence to optimization-based video style transfer, but is three orders of magnitude faster.

2018-05-24
Sallam, A., Bertino, E..  2017.  Detection of Temporal Insider Threats to Relational Databases. 2017 IEEE 3rd International Conference on Collaboration and Internet Computing (CIC). :406–415.

The mitigation of insider threats against databases is a challenging problem as insiders often have legitimate access privileges to sensitive data. Therefore, conventional security mechanisms, such as authentication and access control, may be insufficient for the protection of databases against insider threats and need to be complemented with techniques that support real-time detection of access anomalies. The existing real-time anomaly detection techniques consider anomalies in references to the database entities and the amounts of accessed data. However, they are unable to track the access frequencies. According to recent security reports, an increase in the access frequency by an insider is an indicator of a potential data misuse and may be the result of malicious intents for stealing or corrupting the data. In this paper, we propose techniques for tracking users' access frequencies and detecting anomalous related activities in real-time. We present detailed algorithms for constructing accurate profiles that describe the access patterns of the database users and for matching subsequent accesses by these users to the profiles. Our methods report and log mismatches as anomalies that may need further investigation. We evaluated our techniques on the OLTP-Benchmark. The results of the evaluation indicate that our techniques are very effective in the detection of anomalies.
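
The frequency-tracking idea can be illustrated with a toy profile: count each user's accesses per time window during a training period, then flag later windows whose counts exceed a bound derived from that baseline. The sketch below uses a simple mean-plus-k-sigma bound on synthetic counts; the profile-construction and matching algorithms in the paper are considerably more detailed.

```python
import numpy as np

class FrequencyProfile:
    """Per-user baseline of accesses per time window, with a mean + k*sigma bound."""

    def __init__(self, k=3.0):
        self.k = k
        self.baseline = {}

    def fit(self, training_counts):
        # training_counts: {user: [accesses in window 1, accesses in window 2, ...]}
        for user, counts in training_counts.items():
            counts = np.asarray(counts, dtype=float)
            self.baseline[user] = counts.mean() + self.k * counts.std()

    def is_anomalous(self, user, window_count):
        return window_count > self.baseline.get(user, float("inf"))

rng = np.random.default_rng(0)
training = {"alice": rng.poisson(20, size=200), "bob": rng.poisson(5, size=200)}

profile = FrequencyProfile(k=3.0)
profile.fit(training)
print(profile.is_anomalous("alice", 28))   # within normal variation -> False
print(profile.is_anomalous("bob", 40))     # sudden spike in access frequency -> True
```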

2018-06-20
Chakraborty, S., Stokes, J. W., Xiao, L., Zhou, D., Marinescu, M., Thomas, A..  2017.  Hierarchical learning for automated malware classification. MILCOM 2017 - 2017 IEEE Military Communications Conference (MILCOM). :23–28.

Despite widespread use of commercial anti-virus products, the number of malicious files detected on home and corporate computers continues to increase at a significant rate. Recently, anti-virus companies have started investing in machine learning solutions to augment signatures manually designed by analysts. A malicious file's determination is often represented as a hierarchical structure consisting of a type (e.g. Worm, Backdoor), a platform (e.g. Win32, Win64), a family (e.g. Rbot, Rugrat) and a family variant (e.g. A, B). While there has been substantial research in automated malware classification, the aforementioned hierarchical structure, which can provide additional information to the classification models, has been ignored. In this paper, we propose the novel idea and study the performance of employing hierarchical learning algorithms for automated classification of malicious files. To the best of our knowledge, this is the first research effort which incorporates the hierarchical structure of the malware label in its automated classification and in the security domain, in general. It is important to note that our method does not require any additional effort by analysts because they typically assign these hierarchical labels today. Our empirical results on a real world, industrial-scale malware dataset of 3.6 million files demonstrate that incorporation of the label hierarchy achieves a significant reduction of 33.1% in the binary error rate as compared to a non-hierarchical classifier which is traditionally used in such problems.
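
One simple way to exploit the label hierarchy described above is a top-down cascade: first predict the coarse level (e.g. the malware type), then route the sample to a family classifier trained only on that type. The sketch below builds such a cascade on toy data; it is not the hierarchical learning algorithm studied in the paper, and the type and family labels are illustrative.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Toy data: four (type, family) leaf classes, each a Gaussian blob in feature space.
leaves = [("Worm", "Rbot"), ("Worm", "Sdbot"), ("Backdoor", "Rugrat"), ("Backdoor", "Hupigon")]
X = np.vstack([rng.normal(loc=3.0 * i, size=(100, 16)) for i in range(len(leaves))])
types = np.repeat([t for t, _ in leaves], 100)
families = np.repeat([f for _, f in leaves], 100)

# Level 1: classify the malware type.
type_clf = LogisticRegression(max_iter=1000).fit(X, types)

# Level 2: one family classifier per type, trained only on that type's samples.
family_clfs = {t: LogisticRegression(max_iter=1000).fit(X[types == t], families[types == t])
               for t in set(types)}

def predict_hierarchical(x):
    t = type_clf.predict([x])[0]
    return t, family_clfs[t].predict([x])[0]

print(predict_hierarchical(X[150]))   # expected: ('Worm', 'Sdbot')
```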