Biblio
At a time when all it takes to open a Twitter account is a mobile phone, the act of authenticating information encountered on social media becomes very complex, especially when we lack measures to verify digital identities in the first place. Because the platform supports anonymity, fake news generated by dubious sources have been observed to travel much faster and farther than real news. Hence, we need valid measures to identify authors of misinformation to avert these consequences. Researchers propose different authorship attribution techniques to approach this kind of problem. However, because tweets are made up of only 280 characters, finding a suitable authorship attribution technique is a challenge. This research aims to classify authors of tweets by comparing machine learning methods like logistic regression and naive Bayes. The processes of this application are fetching of tweets, pre-processing, feature extraction, and developing a machine learning model for classification. This paper illustrates the text classification for authorship process using machine learning techniques. In total, there were 46,895 tweets used as both training and testing data, and unique features specific to Twitter were extracted. Several steps were done in the pre-processing phase, including removal of short texts, removal of stop-words and punctuations, tokenizing and stemming of texts as well. This approach transforms the pre-processed data into a set of feature vector in Python. Logistic regression and naive Bayes algorithms were applied to the set of feature vectors for the training and testing of the classifier. The logistic regression based classifier gave the highest accuracy of 91.1% compared to the naive Bayes classifier with 89.8%.
The security of image steganography is an important basis for evaluating steganography algorithms. Steganography has recently made great progress in the long-term confrontation with steganalysis. To improve the security of image steganography, steganography must have the ability to resist detection by steganalysis algorithms. Traditional embedding-based steganography embeds the secret information into the content of an image, which unavoidably leaves a trace of the modification that can be detected by increasingly advanced machine-learning-based steganalysis algorithms. The concept of steganography without embedding (SWE), which does not need to modify the data of the carrier image, appeared to overcome the detection of machine-learning-based steganalysis algorithms. In this paper, we propose a novel image SWE method based on deep convolutional generative adversarial networks. We map the secret information into a noise vector and use the trained generator neural network model to generate the carrier image based on the noise vector. No modification or embedding operations are required during the process of image generation, and the information contained in the image can be extracted successfully by another neural network, called the extractor, after training. The experimental results show that this method has the advantages of highly accurate information extraction and a strong ability to resist detection by state-of-the-art image steganalysis algorithms.
This paper presents an assessment of continuous verification using linguistic style as a cognitive biometric. In stylometry, it is widely known that linguistic style is highly characteristic of authorship using representations that capture authorial style at character, lexical, syntactic, and semantic levels. In this work, we provide a contrast to previous efforts by implementing a one-class classification problem using Isolation Forests. Our approach demonstrates the usefulness of this classifier for accurately verifying the genuine user, and yields recognition accuracy exceeding 98% using very small training samples of 50 and 100-character blocks.
Machine learning (ML) algorithms provide a good solution for many security sensitive applications, they themselves, however, face the threats of adversary attacks. As a key problem in machine learning, how to design robust feature selection algorithms against these attacks becomes a hot issue. The current researches on defending evasion attacks mainly focus on wrapped adversarial feature selection algorithm, i.e., WAFS, which is dependent on the classification algorithms, and time cost is very high for large-scale data. Since mRMR (minimum Redundancy and Maximum Relevance) algorithm is one of the most popular filter algorithms for feature selection without considering any classifier during feature selection process. In this paper, we propose a novel adversary-aware feature selection algorithm under filter model based on mRMR, named FAFS. The algorithm, on the one hand, takes the correlation between a single feature and a label, and the redundancy between features into account; on the other hand, when selecting features, it not only considers the generalization ability in the absence of attack, but also the robustness under attack. The performance of four algorithms, i.e., mRMR, TWFS (Traditional Wrapped Feature Selection algorithm), WAFS, and FAFS is evaluated on spam filtering and PDF malicious detection in the Perfect Knowledge attack scenarios. The experiment results show that FAFS has a better performance under evasion attacks with less time complexity, and comparable classification accuracy.
Machine learning (ML) models are often trained using private datasets that are very expensive to collect, or highly sensitive, using large amounts of computing power. The models are commonly exposed either through online APIs, or used in hardware devices deployed in the field or given to the end users. This provides an incentive for adversaries to steal these ML models as a proxy for gathering datasets. While API-based model exfiltration has been studied before, the theft and protection of machine learning models on hardware devices have not been explored as of now. In this work, we examine this important aspect of the design and deployment of ML models. We illustrate how an attacker may acquire either the model or the model architecture through memory probing, side-channels, or crafted input attacks, and propose (1) power-efficient obfuscation as an alternative to encryption, and (2) timing side-channel countermeasures.
Along with the rapid development of hardware security techniques, the revolutionary growth of countermeasures or attacking methods developed by intelligent and adaptive adversaries have significantly complicated the ability to create secure hardware systems. Thus, there is a critical need to (re)evaluate existing or new hardware security techniques against these state-of-the-art attacking methods. With this in mind, this paper presents a novel framework for incorporating active learning techniques into hardware security field. We demonstrate that active learning can significantly improve the learning efficiency of physical unclonable function (PUF) modeling attack, which samples the least confident and the most informative challenge-response pair (CRP) for training in each iteration. For example, our experimental results show that in order to obtain a prediction error below 4%, 2790 CRPs are required in passive learning, while only 811 CRPs are required in active learning. The sampling strategies and detailed applications of PUF modeling attack under various environmental conditions are also discussed. When the environment is very noisy, active learning may sample a large number of mislabeled CRPs and hence result in high prediction error. We present two methods to mitigate the contradiction between informative and noisy CRPs.
We explore methods of producing adversarial examples on deep generative models such as the variational autoencoder (VAE) and the VAE-GAN. Deep learning architectures are known to be vulnerable to adversarial examples, but previous work has focused on the application of adversarial examples to classification tasks. Deep generative models have recently become popular due to their ability to model input data distributions and generate realistic examples from those distributions. We present three classes of attacks on the VAE and VAE-GAN architectures and demonstrate them against networks trained on MNIST, SVHN and CelebA. Our first attack leverages classification-based adversaries by attaching a classifier to the trained encoder of the target generative model, which can then be used to indirectly manipulate the latent representation. Our second attack directly uses the VAE loss function to generate a target reconstruction image from the adversarial example. Our third attack moves beyond relying on classification or the standard loss for the gradient and directly optimizes against differences in source and target latent representations. We also motivate why an attacker might be interested in deploying such techniques against a target generative network.
Adversaries are conducting attack campaigns with increasing levels of sophistication. Additionally, with the prevalence of out-of-the-box toolkits that simplify attack operations during different stages of an attack campaign, multiple new adversaries and attack groups have appeared over the past decade. Characterizing the behavior and the modus operandi of different adversaries is critical in identifying the appropriate security maneuver to detect and mitigate the impact of an ongoing attack. To this end, in this paper, we study two characteristics of an adversary: Risk-averseness and Experience level. Risk-averse adversaries are more cautious during their campaign while fledgling adversaries do not wait to develop adequate expertise and knowledge before launching attack campaigns. One manifestation of these characteristics is through the adversary's choice and usage of attack tools. To detect these characteristics, we present multi-level machine learning (ML) models that use network data generated while under attack by different attack tools and usage patterns. In particular, for risk-averseness, we considered different configurations for scanning tools and trained the models in a testbed environment. The resulting model was used to predict the cautiousness of different red teams that participated in the Cyber Shield ‘16 exercise. The predictions matched the expected behavior of the red teams. For Experience level, we considered publicly-available remote access tools and usage patterns. We developed a Markov model to simulate usage patterns of attackers with different levels of expertise and through experiments on CyberVAN, we showed that the ML model has a high accuracy.
With an increase in targeted attacks such as advanced persistent threats (APTs), enterprise system defenders require comprehensive frameworks that allow them to collaborate and evaluate their defense systems against such attacks. MITRE has developed a framework which includes a database of different kill-chains, tactics, techniques, and procedures that attackers employ to perform these attacks. In this work, we leverage natural language processing techniques to extract attacker actions from threat report documents generated by different organizations and automatically classify them into standardized tactics and techniques, while providing relevant mitigation advisories for each attack. A naïve method to achieve this is by training a machine learning model to predict labels that associate the reports with relevant categories. In practice, however, sufficient labeled data for model training is not always readily available, so that training and test data come from different sources, resulting in bias. A naïve model would typically underperform in such a situation. We address this major challenge by incorporating an importance weighting scheme called bias correction that efficiently utilizes available labeled data, given threat reports, whose categories are to be automatically predicted. We empirically evaluated our approach on 18,257 real-world threat reports generated between year 2000 and 2018 from various computer security organizations to demonstrate its superiority by comparing its performance with an existing approach.
Now a days, Cloud computing has brought a unbelievable change in companies, organizations, firm and institutions etc. IT industries is advantage with low investment in infrastructure and maintenance with the growth of cloud computing. The Virtualization technique is examine as the big thing in cloud computing. Even though, cloud computing has more benefits; the disadvantage of the cloud computing environment is ensuring security. Security means, the Cloud Service Provider to ensure the basic integrity, availability, privacy, confidentiality, authentication and authorization in data storage, virtual machine security etc. In this paper, we presented a Local outlier factors mechanism, which may be helpful for the detection of Distributed Denial of Service attack in a cloud computing environment. As DDoS attack becomes strong with the passing of time, and then the attack may be reduced, if it is detected at first. So we fully focused on detecting DDoS attack to secure the cloud environment. In addition, our scheme is able to identify their possible sources, giving important clues for cloud computing administrators to spot the outliers. By using WEKA (Waikato Environment for Knowledge Analysis) we have analyzed our scheme with other clustering algorithm on the basis of higher detection rates and lower false alarm rate. DR-LOF would serve as a better DDoS detection tool, which helps to improve security framework in cloud computing.
Intrusion detection systems define an important and dynamic research area for cybersecurity. The role of Intrusion Detection System within security architecture is to improve a security level by identification of all malicious and also suspicious events that could be observed in computer or network system. One of the more specific research areas related to intrusion detection is anomaly detection. Anomaly-based intrusion detection in networks refers to the problem of finding untypical events in the observed network traffic that do not conform to the expected normal patterns. It is assumed that everything that is untypical/anomalous could be dangerous and related to some security events. To detect anomalies many security systems implements a classification or clustering algorithms. However, recent research proved that machine learning models might misclassify adversarial events, e.g. observations which were created by applying intentionally non-random perturbations to the dataset. Such weakness could increase of false negative rate which implies undetected attacks. This fact can lead to one of the most dangerous vulnerabilities of intrusion detection systems. The goal of the research performed was verification of the anomaly detection systems ability to resist this type of attack. This paper presents the preliminary results of tests taken to investigate existence of attack vector, which can use adversarial examples to conceal a real attack from being detected by intrusion detection systems.
As a new research hotspot in the field of artificial intelligence, deep reinforcement learning (DRL) has achieved certain success in various fields such as robot control, computer vision, natural language processing and so on. At the same time, the possibility of its application being attacked and whether it have a strong resistance to strike has also become a hot topic in recent years. Therefore, we select the representative Deep Q Network (DQN) algorithm in deep reinforcement learning, and use the robotic automatic pathfinding application as a countermeasure application scenario for the first time, and attack DQN algorithm against the vulnerability of the adversarial samples. In this paper, we first use DQN to find the optimal path, and analyze the rules of DQN pathfinding. Then, we propose a method that can effectively find vulnerable points towards White-Box Q table variation in DQN pathfinding training. Finally, we build a simulation environment as a basic experimental platform to test our method, through multiple experiments, we can successfully find the adversarial examples and the experimental results show that the supervised method we proposed is effective.
Attackers create new threats and constantly change their behavior to mislead security systems. In this paper, we propose an adaptive threat detection architecture that trains its detection models in real time. The major contributions of the proposed architecture are: i) gather data about zero-day attacks and attacker behavior using honeypots in the network; ii) process data in real time and achieve high processing throughput through detection schemes implemented with stream processing technology; iii) use of two real datasets to evaluate our detection schemes, the first from a major network operator in Brazil and the other created in our lab; iv) design and development of adaptive detection schemes including both online trained supervised classification schemes that update their parameters in real time and learn zero-day threats from the honeypots, and online trained unsupervised anomaly detection schemes that model legitimate user behavior and adapt to changes. The performance evaluation results show that proposed architecture maintains an excellent trade-off between threat detection and false positive rates and achieves high classification accuracy of more than 90%, even with legitimate behavior changes and zero-day threats.
Acoustic speaker recognition systems are very vulnerable to spoofing attacks via replayed or synthesized utterances. One possible countermeasure is audio-visual speaker recognition. Nevertheless, the addition of the visual stream alone does not prevent spoofing attacks completely and only provides further information to assess the authenticity of the utterance. Many systems consider audio and video modalities independently and can easily be spoofed by imitating only a single modality or by a bimodal replay attack with a victim's photograph or video. Therefore, we propose the simultaneous verification of the data synchronicity and the transcription in a challenge-response setup. We use coupled hidden Markov models (CHMMs) for a text-dependent spoofing detection and introduce new features that provide information about the transcriptions of the utterance and the synchronicity of both streams. We evaluate the features for various spoofing scenarios and show that the combination of the features leads to a more robust recognition, also in comparison to the baseline method. Additionally, by evaluating the data on unseen speakers, we show the spoofing detection to be applicable in speaker-independent use-cases.
Training a feed-forward network for the fast neural style transfer of images has proven successful, but the naive extension of processing videos frame by frame is prone to producing flickering results. We propose the first end-to-end network for online video style transfer, which generates temporally coherent stylized video sequences in near realtime. Two key ideas include an efficient network by incorporating short-term coherence, and propagating short-term coherence to long-term, which ensures consistency over a longer period of time. Our network can incorporate different image stylization networks and clearly outperforms the per-frame baseline both qualitatively and quantitatively. Moreover, it can achieve visually comparable coherence to optimization-based video style transfer, but is three orders of magnitude faster.
Gatys et al. recently introduced a neural algorithm that renders a content image in the style of another image, achieving so-called style transfer. However, their framework requires a slow iterative optimization process, which limits its practical application. Fast approximations with feed-forward neural networks have been proposed to speed up neural style transfer. Unfortunately, the speed improvement comes at a cost: the network is usually tied to a fixed set of styles and cannot adapt to arbitrary new styles. In this paper, we present a simple yet effective approach that for the first time enables arbitrary style transfer in real-time. At the heart of our method is a novel adaptive instance normalization (AdaIN) layer that aligns the mean and variance of the content features with those of the style features. Our method achieves speed comparable to the fastest existing approach, without the restriction to a pre-defined set of styles. In addition, our approach allows flexible user controls such as content-style trade-off, style interpolation, color & spatial controls, all using a single feed-forward neural network.
Transferring artistic styles onto everyday photographs has become an extremely popular task in both academia and industry. Recently, offline training has replaced online iterative optimization, enabling nearly real-time stylization. When those stylization networks are applied directly to high-resolution images, however, the style of localized regions often appears less similar to the desired artistic style. This is because the transfer process fails to capture small, intricate textures and maintain correct texture scales of the artworks. Here we propose a multimodal convolutional neural network that takes into consideration faithful representations of both color and luminance channels, and performs stylization hierarchically with multiple losses of increasing scales. Compared to state-of-the-art networks, our network can also perform style transfer in nearly real-time by performing much more sophisticated training offline. By properly handling style and texture cues at multiple scales using several modalities, we can transfer not just large-scale, obvious style cues but also subtle, exquisite ones. That is, our scheme can generate results that are visually pleasing and more similar to multiple desired artistic styles with color and texture cues at multiple scales.
Recent research endeavors have shown the potential of using feed-forward convolutional neural networks to accomplish fast style transfer for images. In this work, we take one step further to explore the possibility of exploiting a feed-forward network to perform style transfer for videos and simultaneously maintain temporal consistency among stylized video frames. Our feed-forward network is trained by enforcing the outputs of consecutive frames to be both well stylized and temporally consistent. More specifically, a hybrid loss is proposed to capitalize on the content information of input frames, the style information of a given style image, and the temporal information of consecutive frames. To calculate the temporal loss during the training stage, a novel two-frame synergic training mechanism is proposed. Compared with directly applying an existing image style transfer method to videos, our proposed method employs the trained network to yield temporally consistent stylized videos which are much more visually pleasant. In contrast to the prior video style transfer method which relies on time-consuming optimization on the fly, our method runs in real time while generating competitive visual results.
At present, mobile terminals are widely used in power system and easy to be the target or springboard to attack the power system. It is necessary to have security assessment of power mobile terminal system to enable early warning of potential risks. In the context, this paper builds the security assessment system against to power mobile terminals, with features from security assessment system of general mobile terminals and power application scenarios. Compared with the existing methods, this paper introduces machine learning to the Rank Correlation Analysis method, which relies on expert experience, and uses objective experimental data to optimize the weight parameters of the indicators. From experiments, this paper proves that weights self-learning method can be used to evaluate the security of power mobile terminal system and improve credibility of the result.