Biblio
The world is fundamentally compositional, so it is natural to think of visual recognition as the recognition of basic visually primitives that are composed according to well-defined rules. This strategy allows us to recognize unseen complex concepts from simple visual primitives. However, the current trend in visual recognition follows a data greedy approach where huge amounts of data are required to learn models for any desired visual concept. In this paper, we build on the compositionality principle and develop an "algebra" to compose classifiers for complex visual concepts. To this end, we learn neural network modules to perform boolean algebra operations on simple visual classifiers. Since these modules form a complete functional set, a classifier for any complex visual concept defined as a boolean expression of primitives can be obtained by recursively applying the learned modules, even if we do not have a single training sample. As our experiments show, using such a framework, we can compose classifiers for complex visual concepts outperforming standard baselines on two well-known visual recognition benchmarks. Finally, we present a qualitative analysis of our method and its properties.
Phishing is the major problem of the internet era. In this era of internet the security of our data in web is gaining an increasing importance. Phishing is one of the most harmful ways to unknowingly access the credential information like username, password or account number from the users. Users are not aware of this type of attack and later they will also become a part of the phishing attacks. It may be the losses of financial found, personal information, reputation of brand name or trust of brand. So the detection of phishing site is necessary. In this paper we design a framework of phishing detection using URL.
Short messages usage has been tremendously increased such as SMS, tweets and status updates. Due to its popularity and ease of use, many companies use it for advertisement purpose. Hackers also use SMS to defraud users and steal personal information. In this paper, the use of Graphs centrality metrics is proposed for spam SMS detection. The graph centrality measures: degree, closeness, and eccentricity are used for classification of SMS. Graphs for each class are created using labeled SMS and then unlabeled SMS is classified using the centrality scores of the token available in the unclassified SMS. Our results show that highest precision and recall is achieved by using degree centrality. Degree centrality achieved the highest precision i.e. 0.81 and recall i.e., 0.76 for spam messages.
In recent years, deep convolution neural networks (DCNNs) have won many contests in machine learning, object detection, and pattern recognition. Furthermore, deep learning techniques achieved exceptional performance in image classification, reaching accuracy levels beyond human capability. Malware variants from similar categories often contain similarities due to code reuse. Converting malware samples into images can cause these patterns to manifest as image features, which can be exploited for DCNN classification. Techniques for converting malware binaries into images for visualization and classification have been reported in the literature, and while these methods do reach a high level of classification accuracy on training datasets, they tend to be vulnerable to overfitting and perform poorly on previously unseen samples. In this paper, we explore and document a variety of techniques for representing malware binaries as images with the goal of discovering a format best suited for deep learning. We implement a database for malware binaries from several families, stored in hexadecimal format. These malware samples are converted into images using various approaches and are used to train a neural network to recognize visual patterns in the input and classify malware based on the feature vectors. Each image type is assessed using a variety of learning models, such as transfer learning with existing DCNN architectures and feature extraction for support vector machine classifier training. Each technique is evaluated in terms of classification accuracy, result consistency, and time per trial. Our preliminary results indicate that improved image representation has the potential to enable more effective classification of new malware.
As drone attracts much interest, the drone industry has opened their market to ordinary people, making drones to be used in daily lives. However, as it got easier for drone to be used by more people, safety and security issues have raised as accidents are much more likely to happen: colliding into people by losing control or invading secured properties. For safety purposes, it is essential for observers and drone to be aware of an approaching drone. In this paper, we introduce a comprehensive drone detection system based on machine learning. This system is designed to be operable on drones with camera. Based on the camera images, the system deduces location on image and vendor model of drone based on machine classification. The system is actually built with OpenCV library. We collected drone imagery and information for learning process. The system's output shows about 89 percent accuracy.
We explore methods of producing adversarial examples on deep generative models such as the variational autoencoder (VAE) and the VAE-GAN. Deep learning architectures are known to be vulnerable to adversarial examples, but previous work has focused on the application of adversarial examples to classification tasks. Deep generative models have recently become popular due to their ability to model input data distributions and generate realistic examples from those distributions. We present three classes of attacks on the VAE and VAE-GAN architectures and demonstrate them against networks trained on MNIST, SVHN and CelebA. Our first attack leverages classification-based adversaries by attaching a classifier to the trained encoder of the target generative model, which can then be used to indirectly manipulate the latent representation. Our second attack directly uses the VAE loss function to generate a target reconstruction image from the adversarial example. Our third attack moves beyond relying on classification or the standard loss for the gradient and directly optimizes against differences in source and target latent representations. We also motivate why an attacker might be interested in deploying such techniques against a target generative network.
The detection of cyber-attacks has become a crucial task for highly sophisticated systems like industrial control systems (ICS). These systems are an essential part of critical information infrastructure. Therefore, we can highlight their vital role in contemporary society. The effective and reliable ICS cyber defense is a significant challenge for the cyber security community. Thus, intrusion detection is one of the demanding tasks for the cyber security researchers. In this article, we examine classification problem. The proposed detection system is based on supervised anomaly detection techniques. Moreover, we utilized classifiers algorithms in order to increase intrusion detection capabilities. The fusion of the classifiers is the way how to achieve the predefined goal.
Distinguishing and classifying different types of malware is important to better understanding how they can infect computers and devices, the threat level they pose and how to protect against them. In this paper, a system for classifying malware programs is presented. The paper describes the architecture of the system and assesses its performance on a publicly available database (provided by Microsoft for the Microsoft Malware Classification Challenge BIG2015) to serve as a benchmark for future research efforts. First, the malicious programs are preprocessed such that they are visualized as gray scale images. We then make use of an architecture comprised of multiple layers (multiple levels of encoding) to carry out the classification process of those images/programs. We compare the performance of this approach against traditional machine learning and pattern recognition algorithms. Our experimental results show that the deep learning architecture yields a boost in performance over those conventional/standard algorithms. A hold-out validation analysis using the superior architecture shows an accuracy in the order of 99.15%.
The correct prediction of faulty modules or classes has a number of advantages such as improving the quality of software and assigning capable development resources to fix such faults. There have been different kinds of fault/defect prediction models proposed in literature, but a great majority of them makes use of static code metrics as independent variables for making predictions. Recently, process metrics have gained a considerable attention as alternative metrics to use for making trust-worthy predictions. The objective of this paper is to investigate different combinations of static code and process metrics for evaluating fault prediction performance. We have used publicly available data sets, along with a frequently used classifier, Naive Bayes, to run our experiments. We have, both statistically and visually, analyzed our experimental results. The statistical analysis showed evidence against any significant difference in fault prediction performances for a variety of different combinations of metrics. This reinforced earlier research results that process metrics are as good as predictors of fault proneness as static code metrics. Furthermore, the visual inspection of box plots revealed that the best set of metrics for fault prediction is a mix of both static code and process metrics. We also presented evidence in support of some process metrics being more discriminating than others and thus making them as good predictors to use.
Most anti-phishing solutions that exist today require scanning a large portion of the web, which is vast and equivalent to finding a needle in a haystack. In addition, such solutions are not very efficient. We propose a different approach. Our solution does not rely on the scanning of the entire Internet or a large portion of it and only needs access to the brand's traffic in order to be able to detect phishing attempts against that brand. By analyzing a sample of phishing websites, we find features that can be used to distinguish phishing websites from the legitimate ones. We then use these features to train a machine learning classifier capable of helping brands detect phishing attempts against them. Our approach can detect up to 86% of phishing attacks against the brands and is best used as a complementary tool to the existing anti-phishing solutions.