Bibliography
Microsoft's PowerShell is a command-line shell and scripting language that is installed by default on Windows machines. Based on Microsoft's .NET framework, it includes an interface that allows programmers to access operating-system services. While PowerShell can be configured by administrators to restrict access and reduce vulnerabilities, these restrictions can be bypassed. Moreover, PowerShell commands can easily be generated dynamically, executed from memory, encoded, and obfuscated, making the logging and forensic analysis of code executed by PowerShell challenging. For all these reasons, PowerShell is increasingly used by cybercriminals as part of their attack tool chains, mainly for downloading malicious content and for lateral movement. Indeed, a recent comprehensive technical report by Symantec dedicated to PowerShell's abuse by cybercriminals [52] reported a sharp increase in the number of malicious PowerShell samples they received and in the number of penetration tools and frameworks that use PowerShell. This highlights the urgent need to develop effective methods for detecting malicious PowerShell commands. In this work, we address this challenge by implementing several novel detectors of malicious PowerShell commands and evaluating their performance. We implemented both "traditional" natural language processing (NLP) based detectors and detectors based on character-level convolutional neural networks (CNNs). Detector performance was evaluated using a large real-world dataset. Our evaluation results show that, although our detectors (and especially the traditional NLP-based ones) individually yield high performance, an ensemble detector that combines an NLP-based classifier with a CNN-based classifier provides the best performance, since the latter classifier is able to detect malicious commands that succeed in evading the former. Our analysis of these evasive commands reveals that some obfuscation patterns automatically detected by the CNN classifier are intrinsically difficult to detect using the NLP techniques we applied. Our detectors provide high recall values while maintaining a very low false-positive rate, making us cautiously optimistic that they can be of practical value.
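As a rough illustration of the character-level CNN detector family described in this abstract, the following Keras sketch classifies raw command strings; the vocabulary, maximum length, and layer sizes are illustrative assumptions, not the paper's configuration.

```python
import numpy as np
from tensorflow.keras import layers, models

VOCAB = sorted(set("abcdefghijklmnopqrstuvwxyz0123456789-_./\\$ '\"|<>{}()"))
CHAR2ID = {c: i + 1 for i, c in enumerate(VOCAB)}  # 0 is reserved for padding
MAX_LEN = 1024  # commands are truncated or padded to this length

def encode(command: str) -> np.ndarray:
    """Map a command string to a fixed-length sequence of character ids."""
    ids = [CHAR2ID.get(c, 0) for c in command.lower()[:MAX_LEN]]
    return np.pad(ids, (0, MAX_LEN - len(ids)))

model = models.Sequential([
    layers.Input(shape=(MAX_LEN,)),
    layers.Embedding(len(VOCAB) + 1, 32),           # learned character embeddings
    layers.Conv1D(128, kernel_size=7, activation="relu"),
    layers.GlobalMaxPooling1D(),
    layers.Dense(64, activation="relu"),
    layers.Dense(1, activation="sigmoid"),          # P(command is malicious)
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
# Training would use encoded commands: model.fit(np.stack([encode(c) for c in cmds]), y)
```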
The Internet of Things (IoT) in a military setting generally consists of a diverse range of Internet-connected devices and nodes (e.g., medical devices and wearable combat uniforms), which are a valuable target for cybercriminals, particularly state-sponsored or nation-state actors. A common attack vector is the use of malware. In this paper, we present a deep learning based method to detect Internet of Battlefield Things (IoBT) malware via the device's operational code (OpCode) sequence. We transmute OpCodes into a vector space and apply a deep Eigenspace learning approach to classify malicious and benign applications. We also demonstrate the robustness of our proposed approach in malware detection and its sustainability against junk-code insertion attacks. Lastly, we make our malware samples available on GitHub, which will hopefully benefit future research efforts (e.g., the evaluation of proposed malware detection approaches).
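A minimal sketch of the OpCode-to-vector-space idea, with a truncated-SVD projection standing in for the paper's deep Eigenspace learning step; the toy opcode sequences, n-gram range, and classifier choice are assumptions for illustration.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import TruncatedSVD
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Each sample is the disassembled opcode sequence of one application (toy data).
samples = ["mov push call mov ret", "xor xor jmp call pop ret"]
labels = [0, 1]  # 0 = benign, 1 = malicious

pipeline = make_pipeline(
    CountVectorizer(ngram_range=(1, 2)),  # opcode uni-/bi-gram counts
    TruncatedSVD(n_components=2),         # eigenspace-style projection
    LogisticRegression(),
)
pipeline.fit(samples, labels)
print(pipeline.predict(["xor jmp call ret"]))  # classify an unseen sequence
```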
In this paper, inspired by Gatys's recent work, we propose a novel approach that transforms photos into comics using deep convolutional neural networks (CNNs). While Gatys's method, which uses a pre-trained VGG network, generally works well for transferring artistic styles such as painting from a style image to a content image, for more minimalist styles such as comics the method often fails to produce satisfactory results. To address this, we further introduce a dedicated comic style CNN, which is trained to classify comic images and photos. This new network is effective in capturing various comic styles and thus helps to produce better comic stylization results. Even with a grayscale style image, Gatys's method can still produce colored output, which is not desirable for comics. We develop a modified optimization framework such that a grayscale image is guaranteed to be synthesized. To avoid converging to poor local minima, we further initialize the output image using the grayscale version of the content image. Various examples show that our method synthesizes better comic images than the state-of-the-art method.
Network traffic classification is an important problem in network traffic analysis. It plays a vital role in many network tasks, including quality of service, firewall enforcement, and security. One of the challenging aspects of classifying network traffic is the imbalanced nature of network data: the amount of traffic in some classes is typically much higher than in others. In this paper, we propose an application of a deep learning approach to address the imbalanced-data problem in network traffic classification. We use a recently proposed deep network for unsupervised learning, the Auxiliary Classifier Generative Adversarial Network, to generate synthesized data samples that balance the minority and majority classes. We tested our method on a well-known network traffic dataset, and the results show that our proposed method achieves better performance than a recently proposed method for handling the imbalance problem in network traffic classification.
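The sketch below shows how a class-conditional generator in the spirit of AC-GAN could be used to synthesize minority-class flow features once adversarial training (omitted here) is done; the latent size, feature dimension, class count, and minority-class id are assumed values.

```python
import numpy as np
from tensorflow.keras import layers, models

LATENT_DIM, N_CLASSES, N_FEATURES = 32, 5, 20  # assumed sizes

def build_generator():
    noise = layers.Input(shape=(LATENT_DIM,))
    label = layers.Input(shape=(1,), dtype="int32")
    emb = layers.Flatten()(layers.Embedding(N_CLASSES, LATENT_DIM)(label))
    x = layers.Concatenate()([noise, emb])   # condition noise on the class
    x = layers.Dense(64, activation="relu")(x)
    out = layers.Dense(N_FEATURES)(x)        # synthetic flow-feature vector
    return models.Model([noise, label], out)

generator = build_generator()
# After adversarial training (omitted), oversample an assumed minority class 3:
z = np.random.normal(size=(100, LATENT_DIM)).astype("float32")
y = np.full((100, 1), 3)
synthetic_flows = generator.predict([z, y])
```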
Large-scale mobile traffic analytics is becoming essential to digital infrastructure provisioning, public transportation, events planning, and other domains. Monitoring city-wide mobile traffic is, however, a complex and costly process that relies on dedicated probes. Some of these probes have limited precision or coverage; others gather tens of gigabytes of logs daily, which independently offer limited insight. Extracting fine-grained patterns involves expensive spatial aggregation of measurements, storage, and post-processing. In this paper, we propose a mobile traffic super-resolution technique that overcomes these problems by inferring narrowly localised traffic consumption from coarse measurements. We draw inspiration from image processing and design a deep-learning architecture tailored to mobile networking, which combines Zipper Network (ZipNet) and Generative Adversarial Network (GAN) models. This makes it possible to uniquely capture spatio-temporal relations between traffic volume snapshots routinely monitored over broad coverage areas ('low resolution') and the corresponding consumption at the 0.05 km² level ('high resolution') usually obtained only after intensive computation. Experiments we conduct with a real-world dataset demonstrate that the proposed ZipNet(-GAN) infers traffic consumption with remarkable accuracy and up to 100× higher granularity than standard probing, while outperforming existing data interpolation techniques. To our knowledge, this is the first time super-resolution concepts have been applied to large-scale mobile traffic analysis, and our solution is the first to infer fine-grained urban traffic patterns from coarse aggregates.
The paper presents a fully automatic, end-to-end trainable system to colorize grayscale images. Colorization is a highly under-constrained problem. In order to produce realistic outputs, the proposed approach takes advantage of recent advances in deep learning and generative networks. To achieve plausible colorization, the paper investigates conditional Wasserstein Generative Adversarial Networks (WGANs) [3] as a solution to this problem. Additionally, a loss function is proposed consisting of two classification loss components in addition to the adversarial loss learned by the WGAN. The first classification loss provides a measure of how much the predicted colored images differ from the ground truth. The second classification loss component makes use of ground-truth semantic classification labels in order to learn meaningful intermediate features. Finally, the WGAN training procedure pushes the predictions towards the manifold of natural images. The system is validated using a user study and a semantic interpretability test, and achieves results comparable to [1] on the ImageNet dataset [10].
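To make the composite objective concrete, this sketch assembles a generator loss of the described shape: a WGAN critic term plus two classification-style terms, one against quantized ground-truth color bins and one against semantic labels. The loss weights, tensor names, and the use of quantized color bins are assumptions, not the paper's exact formulation.

```python
import tensorflow as tf

def generator_loss(critic_scores, color_logits, true_color_bins,
                   semantic_logits, true_labels,
                   w_adv=1.0, w_color=1.0, w_sem=1.0):
    """Composite generator objective: adversarial + two classification terms."""
    adv = -tf.reduce_mean(critic_scores)  # WGAN adversarial term
    color = tf.reduce_mean(tf.keras.losses.sparse_categorical_crossentropy(
        true_color_bins, color_logits, from_logits=True))   # vs. ground truth
    sem = tf.reduce_mean(tf.keras.losses.sparse_categorical_crossentropy(
        true_labels, semantic_logits, from_logits=True))    # semantic labels
    return w_adv * adv + w_color * color + w_sem * sem
```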
In recent years, there has been remarkable development in unmanned aerial vehicles (UAVs); certain companies are also trying to use UAVs to deliver goods. It is therefore predicted that many such objects will fly over cities in the near future. This study proposes a system for monitoring objects flying over a city. We use multiple 4K video cameras to capture videos of the flying objects. In this research, we combine background subtraction and a state-of-the-art tracking method, the kernelized correlation filter (KCF), for detection and tracking. We use deep learning for classification and structure from motion (SfM) to calculate the 3-dimensional trajectory. A UAV is flown over the inner-city area of Tokyo and videos are captured. The accuracy of each processing step is verified using the videos of objects flying over the city. Each step achieves a reasonable level of accuracy; thus, there is a good prospect of creating a system to monitor objects flying over a city.
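A minimal sketch of the detection-plus-tracking stage: background subtraction proposes moving regions and a KCF tracker follows one across frames. It assumes an opencv-contrib build (where the KCF tracker is available; in some builds it lives under cv2.legacy) and a hypothetical input file name.

```python
import cv2

cap = cv2.VideoCapture("sky_camera.mp4")            # hypothetical input video
subtractor = cv2.createBackgroundSubtractorMOG2(history=500)
tracker = None

while True:
    ok, frame = cap.read()
    if not ok:
        break
    if tracker is None:
        # Background subtraction proposes candidate moving regions.
        mask = subtractor.apply(frame)
        contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                       cv2.CHAIN_APPROX_SIMPLE)
        boxes = [cv2.boundingRect(c) for c in contours
                 if cv2.contourArea(c) > 50]         # ignore tiny blobs
        if boxes:
            tracker = cv2.TrackerKCF_create()        # cv2.legacy.* in some builds
            tracker.init(frame, boxes[0])            # track the first detection
    else:
        ok, box = tracker.update(frame)
        if not ok:
            tracker = None                           # target lost; re-detect
cap.release()
```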
This paper is the first work to perform spatio-temporal mapping of human activity using the visual content of geo-tagged videos. We utilize a recent deep-learning based video analysis framework, termed hidden two-stream networks, to recognize a range of activities in YouTube videos. This framework is efficient and can run in real time or faster, which is important for recognizing events as they occur in streaming video and for reducing latency in analyzing already captured video. This is, in turn, important for using video in smart-city applications. We perform a series of experiments to show that our approach is able to map activities both spatially and temporally.
In order to overcome the limitations of single-mode biometric identification technology, a bimodal biometric verification system based on deep learning is proposed in this paper. A modified CNN architecture is used to generate better facial features for bimodal fusion. The obtained facial features and the acoustic features extracted by the acoustic feature extraction model are fused together at the feature layer to form the fusion feature. The fusion feature obtained in this way is used to train a neural network that identifies the target person having these corresponding features. Experimental results, obtained on a bimodal database assembled from TED-LIUM and CASIA-WebFace, demonstrate the superiority and high performance of our bimodal biometric system in comparison with single-mode biometrics for identity authentication. Compared with using facial or acoustic features alone, the classification accuracy of the fusion feature obtained by our method is noticeably higher.
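A sketch of the feature-layer fusion step, assuming precomputed face and voice embeddings of hypothetical sizes: the two vectors are concatenated and fed to a small identification network.

```python
from tensorflow.keras import layers, models

FACE_DIM, VOICE_DIM, N_IDENTITIES = 128, 64, 100  # assumed sizes

face_in = layers.Input(shape=(FACE_DIM,))    # embedding from the modified CNN
voice_in = layers.Input(shape=(VOICE_DIM,))  # embedding from the acoustic model
fused = layers.Concatenate()([face_in, voice_in])          # feature-layer fusion
x = layers.Dense(256, activation="relu")(fused)
out = layers.Dense(N_IDENTITIES, activation="softmax")(x)  # identity posterior

model = models.Model([face_in, voice_in], out)
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```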
Vulnerabilities need to be detected and removed from software. Although previous studies have demonstrated the usefulness of employing prediction techniques to decide about the vulnerability of software components, improving the effectiveness of these prediction techniques remains a challenging research question. This paper employs a technique based on a deep neural network with rectified linear units, trained with stochastic gradient descent and batch normalization, for predicting vulnerable software components. The features are defined as continuous sequences of tokens in source code files. In addition, a statistical feature selection algorithm is employed to reduce the feature and search space. We evaluated the proposed technique on several Java Android applications, and the results demonstrate that it can predict vulnerable classes, i.e., software components, with high precision, accuracy, and recall.
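A minimal sketch matching the described setup, i.e. a feed-forward network with rectified linear units and batch normalization trained with SGD; the layer sizes and feature dimension are illustrative assumptions.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

N_FEATURES = 500  # token features remaining after statistical feature selection

model = models.Sequential([
    layers.Input(shape=(N_FEATURES,)),
    layers.Dense(256), layers.BatchNormalization(), layers.ReLU(),
    layers.Dense(128), layers.BatchNormalization(), layers.ReLU(),
    layers.Dense(1, activation="sigmoid"),  # P(component is vulnerable)
])
model.compile(optimizer=tf.keras.optimizers.SGD(learning_rate=0.01),
              loss="binary_crossentropy",
              metrics=[tf.keras.metrics.Precision(), tf.keras.metrics.Recall()])
```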
Human activity recognition is one of the important research topics in computer vision and video understanding. It is often assumed that high-quality video sequences are available for recognition. However, relaxing such a requirement and implementing robust recognition using videos with reduced data rates can achieve efficiency in storing and transmitting video data. Three-dimensional video scalability, which refers to the possibility of reducing the spatial, temporal, and quality resolutions of videos, is an effective way to flexibly represent and manage video data. In this paper, we investigate the impact of video scalability on multi-view activity recognition. We employ both a spatiotemporal feature extraction-based method and a deep learning-based method using convolutional and recurrent neural networks. The recognition performance of the two methods is examined, along with in-depth analysis of how their performance varies with respect to various scalability combinations. In particular, we demonstrate that the deep learning-based method can achieve significantly improved robustness in comparison to the feature-based method. Furthermore, we investigate optimal scalability combinations with respect to bitrate in order to provide useful guidelines for an optimal operation policy in resource-constrained activity recognition systems.
Network intrusion detection systems (NIDS) are essential security building blocks for today's organizations to ensure safe and trusted communication of information. In this paper, we study the feasibility of off-line deep learning based NIDSes by constructing the detection engine with multiple advanced deep learning models and conducting a quantitative and comparative evaluation of those models. We first introduce the general deep learning methodology and its potential implications for the network intrusion detection problem. We then review multiple machine learning solutions to two network intrusion detection tasks (the NSL-KDD and UNSW-NB15 datasets). We develop a TensorFlow-based deep learning library, called NetLearner, and implement a handful of cutting-edge deep learning models for NIDS. Finally, we conduct a quantitative and comparative performance evaluation of those models using NetLearner.
Distinguishing and classifying different types of malware is important to better understand how they can infect computers and devices, the threat level they pose, and how to protect against them. In this paper, a system for classifying malware programs is presented. The paper describes the architecture of the system and assesses its performance on a publicly available database (provided by Microsoft for the Microsoft Malware Classification Challenge, BIG2015) to serve as a benchmark for future research efforts. First, the malicious programs are preprocessed so that they are visualized as grayscale images. We then make use of an architecture comprised of multiple layers (multiple levels of encoding) to carry out the classification of those images/programs. We compare the performance of this approach against traditional machine learning and pattern recognition algorithms. Our experimental results show that the deep learning architecture yields a boost in performance over those conventional/standard algorithms. A hold-out validation analysis using the superior architecture shows an accuracy of 99.15%.
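The byte-to-image preprocessing step can be sketched in a few lines; the fixed image width used here is an assumption (papers often choose it as a function of file size).

```python
import numpy as np
from PIL import Image

def binary_to_grayscale(path: str, width: int = 256) -> Image.Image:
    """Reshape a binary's raw bytes into a 2-D array and wrap it as an image."""
    data = np.fromfile(path, dtype=np.uint8)         # raw bytes of the binary
    rows = len(data) // width
    img = data[: rows * width].reshape(rows, width)  # drop the ragged tail
    return Image.fromarray(img, mode="L")

# binary_to_grayscale("sample.exe").save("sample.png")
```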
Intrusion detection systems do not perform well when it comes to detecting zero-day attacks; improving their performance in that regard is therefore an active research topic. In this study, to detect zero-day attacks with high accuracy, we propose two deep learning based anomaly detection models using an autoencoder and a denoising autoencoder, respectively. The key factor that directly affects the accuracy of the proposed models is the threshold value, which was determined using a stochastic approach rather than the approaches available in the current literature. The proposed models were tested using the KDDTest+ dataset contained in NSL-KDD, achieving accuracies of 88.28% and 88.65%, respectively. The results show that, as singular models, our proposed anomaly detection models outperform any other singular anomaly detection method and perform almost as well as the recently suggested hybrid anomaly detection models.
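A sketch of the autoencoder variant of this detector: the network is trained to reconstruct normal traffic, and records whose reconstruction error exceeds a threshold are flagged. The architecture, feature dimension, and the threshold rule shown are common-practice assumptions, not the study's stochastic procedure.

```python
import numpy as np
from tensorflow.keras import layers, models

N_FEATURES = 122  # e.g. one-hot encoded NSL-KDD records (assumed)

autoencoder = models.Sequential([
    layers.Input(shape=(N_FEATURES,)),
    layers.Dense(32, activation="relu"),  # encoder
    layers.Dense(8, activation="relu"),   # bottleneck
    layers.Dense(32, activation="relu"),  # decoder
    layers.Dense(N_FEATURES),
])
autoencoder.compile(optimizer="adam", loss="mse")
# autoencoder.fit(X_normal, X_normal, epochs=20, batch_size=128)

def detect(x, threshold):
    """Flag records whose reconstruction error exceeds the threshold."""
    err = np.mean((autoencoder.predict(x) - x) ** 2, axis=1)
    return err > threshold
# A common heuristic: threshold = err_val.mean() + 3 * err_val.std()
```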
State-of-the-art convolutional neural networks (ConvNets) are now able to achieve near-human performance on a wide range of classification tasks. Unfortunately, current hardware implementations of ConvNets are memory-power intensive, prohibiting deployment in low-power embedded systems and IoE platforms. One method of reducing memory power is to exploit the error resilience of ConvNets and accept bit errors under reduced supply voltages. In this paper, we extensively study the effectiveness of this idea and show that further savings are possible by injecting bit errors during ConvNet training. Measurements on an 8KB SRAM in 28nm UTBB FD-SOI CMOS demonstrate a supply voltage reduction of 310mV, which results in up to 5.4× leakage power reduction and up to 2.9× memory access power reduction at 99% of floating-point classification accuracy, with no additional hardware cost. To our knowledge, this is the first silicon-validated study of the effect of bit errors in ConvNets.
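A toy illustration of the bit-error-injection idea: weights are quantized to 8 bits and each stored bit is flipped with a small probability, mimicking low-voltage SRAM faults. The int8 format, the assumed weight range, and the flip rate are assumptions for illustration.

```python
import numpy as np

def inject_bit_errors(weights: np.ndarray, p_flip: float = 1e-3) -> np.ndarray:
    """Quantize weights (assumed in [-1, 1]) to int8 and flip stored bits."""
    q = np.clip(np.round(weights * 127), -128, 127).astype(np.int8)
    bits = np.unpackbits(q.view(np.uint8))         # the stored SRAM bits
    flips = np.random.random(bits.shape) < p_flip  # random low-voltage faults
    bits ^= flips.astype(np.uint8)
    noisy = np.packbits(bits).view(np.int8)
    return noisy.reshape(weights.shape).astype(np.float32) / 127.0

# Applying this to each layer's weights every few training steps exposes the
# network to the fault distribution it will face at reduced supply voltage.
```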
Cryptanalysis (the study of methods to read encrypted information without knowledge of the encryption key) has traditionally been separated into mathematical analysis of weaknesses in cryptographic algorithms, on the one hand, and side-channel attacks, which aim to exploit weaknesses in the implementation of encryption and decryption algorithms, on the other. Mathematical analysis generally makes assumptions about the algorithm with the aim of reconstructing the key relating plaintext to ciphertext through brute-force methods. Complexity issues tend to dominate the systematic search for keys. To date, there has been very little research on a third cryptanalysis method: learning the key through convergence, based on associations between plaintext and ciphertext. Recent advances in deep learning using multi-layered artificial neural networks (ANNs) provide an opportunity to reassess the role of deep learning architectures in next-generation cryptanalysis methods based on neurocryptography (NC). In this paper, we explore the capability of deep ANNs to decrypt encrypted messages with minimal knowledge of the algorithm. From the experimental results, it can be concluded that DNNs can encrypt and decrypt to levels of accuracy that fall short of 100% because of the stochastic aspects of ANNs. This aspect may, however, be useful if communication is under cryptanalysis attack, since the attacker will not know for certain that the key K used for encryption and decryption has been found. Also, uncertainty concerning the architecture used for encryption and decryption adds another layer of uncertainty that has no counterpart in traditional cryptanalysis.
Here we explore the applicability of the traditional sliding-window-based convolutional neural network (CNN) detection pipeline and of region-based object detection techniques such as Faster Region-based CNN (Faster R-CNN) and Region-based Fully Convolutional Networks (R-FCN) to the problem of object detection in X-ray security imagery. Within this context, given limited dataset availability, we employ a transfer learning paradigm for network training, tackling both single and multiple object detection problems over a number of R-CNN/R-FCN variants. The use of a first-stage region proposal within Faster R-CNN and R-FCN provides superior results to the traditional sliding-window-driven CNN (SWCNN) approach. Using Faster R-CNN with VGG16, pretrained on the ImageNet dataset, we achieve 88.3 mAP for a six-class X-ray object detection problem. The use of R-FCN with ResNet-101 yields 96.3 mAP for the two-class firearm detection problem, requiring 0.1 seconds of computation per image. Overall, we illustrate the comparative performance of these techniques as object localization strategies within cluttered X-ray security imagery.
Deep Learning has recently become hugely popular in machine learning for its ability to realize end-to-end learning systems, in which the features and the classifiers are learned simultaneously, providing significant improvements in classification accuracy in the presence of highly structured and large databases. Its success is due to a combination of recent algorithmic breakthroughs, increasingly powerful computers, and access to significant amounts of data. Researchers have also considered the privacy implications of deep learning. Models are typically trained in a centralized manner, with all the data being processed by the same training algorithm. If the data is a collection of users' private data, including habits, personal pictures, geographical positions, interests, and more, the centralized server will have access to sensitive information that could potentially be mishandled. To tackle this problem, collaborative deep learning models have recently been proposed in which parties locally train their deep learning structures and share only a subset of the parameters in an attempt to keep their respective training sets private. Parameters can also be obfuscated via differential privacy (DP) to make information extraction even more challenging, as proposed by Shokri and Shmatikov at CCS'15. Unfortunately, we show that any privacy-preserving collaborative deep learning is susceptible to a powerful attack that we devise in this paper. In particular, we show that a distributed, federated, or decentralized deep learning approach is fundamentally broken and does not protect the training sets of honest participants. The attack we developed exploits the real-time nature of the learning process, which allows the adversary to train a Generative Adversarial Network (GAN) that generates prototypical samples of the targeted training set that was meant to be private (the samples generated by the GAN are intended to come from the same distribution as the training data). Interestingly, we show that record-level differential privacy applied to the shared parameters of the model, as suggested in previous work, is ineffective (i.e., record-level DP is not designed to address our attack).
This work describes how automated data generation integrates into a big data pipeline. A lack of veracity in big data can produce models that are inaccurate or biased by trends in the training data, which can lead to issues that are difficult to overcome as a pipeline matures. This work describes the use of a Generative Adversarial Network to generate sketch data, such as might be used in a human verification task. The generated sketches are verified as recognizable using a crowd-sourcing methodology; we find that they were correctly recognized 43.8% of the time, in contrast to human-drawn sketches, which were recognized 87.7% of the time. This method is scalable and can be used to generate realistic data in many domains and to bootstrap a dataset used for training a model prior to deployment.