Biblio
Classifying malware programs is a research area attracting great interest for Anti-Malware industry. In this research, we propose a system that visualizes malware programs as images and distinguishes those using Convolutional Neural Networks (CNNs). We study the performance of several well-established CNN based algorithms such as AlexNet, ResNet and VGG16 using transfer learning approaches. We also propose a computationally efficient CNN-based architecture for classification of malware programs. In addition, we study the performance of these CNNs as feature extractors by using Support Vector Machine (SVM) and K-nearest Neighbors (kNN) for classification purposes. We also propose fusion methods to boost the performance further. We make use of the publicly available database provided by Microsoft Malware Classification Challenge (BIG 2015) for this study. Our overall performance is 99.4% for a set of 2174 test samples comprising 9 different classes thereby setting a new benchmark.
The clear, social, and dark web have lately been identified as rich sources of valuable cyber-security information that -given the appropriate tools and methods-may be identified, crawled and subsequently leveraged to actionable cyber-threat intelligence. In this work, we focus on the information gathering task, and present a novel crawling architecture for transparently harvesting data from security websites in the clear web, security forums in the social web, and hacker forums/marketplaces in the dark web. The proposed architecture adopts a two-phase approach to data harvesting. Initially a machine learning-based crawler is used to direct the harvesting towards websites of interest, while in the second phase state-of-the-art statistical language modelling techniques are used to represent the harvested information in a latent low-dimensional feature space and rank it based on its potential relevance to the task at hand. The proposed architecture is realised using exclusively open-source tools, and a preliminary evaluation with crowdsourced results demonstrates its effectiveness.
In the last few years, cryptocurrency mining has become more and more important on the Internet activity and nowadays is even having a noticeable impact on the global economy. This has motivated the emergence of a new malicious activity called cryptojacking, which consists of compromising other machines connected to the Internet and leverage their resources to mine cryptocurrencies. In this context, it is of particular interest for network administrators to detect possible cryptocurrency miners using network resources without permission. Currently, it is possible to detect them using IP address lists from known mining pools, processing information from DNS traffic, or directly performing Deep Packet Inspection (DPI) over all the traffic. However, all these methods are still ineffective to detect miners using unknown mining servers or result too expensive to be deployed in real-world networks with large traffic volume. In this paper, we present a machine learning-based method able to detect cryptocurrency miners using NetFlow/IPFIX network measurements. Our method does not require to inspect the packets' payload; as a result, it achieves cost-efficient miner detection with similar accuracy than DPI-based techniques.
Nowadays, many applications involve big data and big data analysis methods appear in many fields. As a preliminary attempt to solve the challenge of big data analysis, this paper presents a distributed online learning algorithm based on differential privacy. Since online learning can effectively process sensitive data, we introduce the concept of differential privacy in distributed online learning algorithms, with the aim at ensuring data privacy during online learning to prevent adversarial nodes from inferring any important data information. In particular, for different adversary models, we consider different type graphs to tolerate a limited number of adversaries near each regular node or tolerate a global limited number of adversaries.
Automatic emotion recognition using computer vision is significant for many real-world applications like photojournalism, virtual reality, sign language recognition, and Human Robot Interaction (HRI) etc., Psychological research findings advocate that humans depend on the collective visual conduits of face and body to comprehend human emotional behaviour. Plethora of studies have been done to analyse human emotions using facial expressions, EEG signals and speech etc., Most of the work done was based on single modality. Our objective is to efficiently integrate emotions recognized from facial expressions and upper body pose of humans using images. Our work on bimodal emotion recognition provides the benefits of the accuracy of both the modalities.
Trust is known to be a key component in human social relationships. It is trust that defines human behavior with others to a large extent. Generative models have been extensively used in social networks study to simulate different characteristics and phenomena in social graphs. In this work, an attempt is made to understand how trust in social graphs can be combined with generative modeling techniques to generate trust-based social graphs. These generated social graphs are then compared with the original social graphs to evaluate how trust helps in generative modeling. Two well-known social network data sets i.e. the soc-Bitcoin and the wiki administrator network data sets are used in this work. Social graphs are generated from these data sets and then compared with the original graphs along with other standard generative modeling techniques to see how trust is a good component in this. Other Generative modeling techniques have been available for a while but this investigation with the real social graph data sets validate that trust can be an important factor in generative modeling.
The upsurge of Industrial Internet of Things is forcing industrial information systems to enable less hierarchical information flow. The connections between humans, devices, and their digital twins are growing in numbers, creating a need for new kind of security and trust solutions. To address these needs, industries are applying distributed ledger technologies, aka blockchains. A significant number of use cases have been studied in the sectors of logistics, energy markets, smart grid security, and food safety, with frequently reported benefits in transparency, reduced costs, and disintermediation. However, distributed ledger technologies have challenges with transaction throughput, latency, and resource requirements, which render the technology unusable in many cases, particularly with constrained Internet of Things devices.To overcome these challenges within the Industrial Internet of Things, we suggest a set of interledger approaches that enable trusted information exchange across different ledgers and constrained devices. With these approaches, the technically most suitable ledger technology can be selected for each use case while simultaneously enjoying the benefits of the most widespread ledger implementations. We present state of the art for distributed ledger technologies to support the use of interledger approaches in industrial settings.
Witnessing the increasingly pervasive deployment of security video surveillance systems(VSS), more and more individuals have become concerned with the issues of privacy violations. While the majority of the public have a favorable view of surveillance in terms of crime deterrence, individuals do not accept the invasive monitoring of their private life. To date, however, there is not a lightweight and secure privacy-preserving solution for video surveillance systems. The recent success of blockchain (BC) technologies and their applications in the Internet of Things (IoT) shed a light on this challenging issue. In this paper, we propose a Lightweight, Blockchain-based Privacy protection (Lib-Pri) scheme for surveillance cameras at the edge. It enables the VSS to perform surveillance without compromising the privacy of people captured in the videos. The Lib-Pri system transforms the deployed VSS into a system that functions as a federated blockchain network capable of carrying out integrity checking, blurring keys management, feature sharing, and video access sanctioning. The policy-based enforcement of privacy measures is carried out at the edge devices for real-time video analytics without cluttering the network.
We propose a distributed machine-learning architecture to predict trustworthiness of sensor services in Mobile Edge Computing (MEC) based Internet of Things (IoT) services, which aligns well with the goals of MEC and requirements of modern IoT systems. The proposed machine-learning architecture models training a distributed trust prediction model over a topology of MEC-environments as a Network Lasso problem, which allows simultaneous clustering and optimization on large-scale networked-graphs. We then attempt to solve it using Alternate Direction Method of Multipliers (ADMM) in a way that makes it suitable for MEC-based IoT systems. We present analytical and simulation results to show the validity and efficiency of the proposed solution.
The task of attack attribution, i.e., identifying the entity responsible for an attack, is complicated and usually requires the involvement of an experienced security expert. Prior attempts to automate attack attribution apply various machine learning techniques on features extracted from the malware's code and behavior in order to identify other similar malware whose authors are known. However, the same malware can be reused by multiple actors, and the actor who performed an attack using a malware might differ from the malware's author. Moreover, information collected during an incident may contain many clues about the identity of the attacker in addition to the malware used. In this paper, we propose a method of attack attribution based on textual analysis of threat intelligence reports, using state of the art algorithms and models from the fields of machine learning and natural language processing (NLP). We have developed a new text representation algorithm which captures the context of the words and requires minimal feature engineering. Our approach relies on vector space representation of incident reports derived from a small collection of labeled reports and a large corpus of general security literature. Both datasets have been made available to the research community. Experimental results show that the proposed representation can attribute attacks more accurately than the baselines' representations. In addition, we show how the proposed approach can be used to identify novel previously unseen threat actors and identify similarities between known threat actors.
Machine learning is a major area in artificial intelligence, which enables computer to learn itself explicitly without programming. As machine learning is widely used in making decision automatically, attackers have strong intention to manipulate the prediction generated my machine learning model. In this paper we study about the different types of attacks and its countermeasures on machine learning model. By research we found that there are many security threats in various algorithms such as K-nearest-neighbors (KNN) classifier, random forest, AdaBoost, support vector machine (SVM), decision tree, we revisit existing security threads and check what are the possible countermeasures during the training and prediction phase of machine learning model. In machine learning model there are 2 types of attacks that is causative attack which occurs during the training phase and exploratory attack which occurs during the prediction phase, we will also discuss about the countermeasures on machine learning model, the countermeasures are data sanitization, algorithm robustness enhancement, and privacy preserving techniques.
Machine learning has been adopted widely to perform prediction and classification. Implementing machine learning increases security risks when computation process involves sensitive data on training and testing computations. We present a proposed system to protect machine learning engines in IoT environment without modifying internal machine learning architecture. Our proposed system is designed for passwordless and eliminated the third-party in executing machine learning transactions. To evaluate our a proposed system, we conduct experimental with machine learning transactions on IoT board and measure computation time each transaction. The experimental results show that our proposed system can address security issues on machine learning computation with low time consumption.
In this study, we conducted a survey of those who have used E-Government Services (civil servants, employees of public institutions, and the public) to empirically identify the factors affecting the continuous use intention E-Government Services, and conducted an empirical analysis using SPSS and Smart PLS with 284 valid samples except for dual, error and poor answers. Based on the success model of the information system (IS access model), we set independent variables which were divided into quality factors (service quality, system quality, information quality) and risk factors (personal information and security), and perceived ease of use and reliability, which are the main variables based on the technology acceptance model (TAM) that best describes the parameter group, were established as useful parameters. In addition, we design the research model by setting user satisfaction and the continuous use intention as dependent variables, conducted the study about how affecting factors influence to the acceptance factors through 14 hypotheses.The study found that 12 from 14 hypotheses were adopted and 2 were rejected. Looking at the results derived, it was analyzed that, firstly, 3 quality factors all affect perceived ease of use in relation to the quality of service, system quality, information quality which are perceived ease of use of E-Government Services. Second, in relation to the quality of service quality, system quality, information quality and perceived usefulness which are the quality factors of E-Government Services, the quality of service and information quality affect perceived usefulness, but system quality does not affect perceived usefulness. Third, it was analyzed that both factors influence reliability in the relationship between Privacy and security and trust which are risk factors. Fourth, the relationship between perceived ease of use and perceived usefulness has shown that perceived ease of use does not affect perceived usefulness. Finally, the relationship between user value factors (perceptual usability, perceived usefulness and trust) and user satisfaction and the continuous use intention was analyzed that user value factors affect user satisfaction while user satisfaction affects the continuous use intention. This study can be meaningful in that it theoretically presented the factors influencing the continued acceptance of e-government services through precedent research, presented the variables and measurement items verified through the empirical analysis process, and verified the causal relationship between the variables. The e-government service can contribute to the implementation of e-government in line with the era of the 4th Industrial Revolution by using it as a reference to the establishment of policies to improve the quality of people's lives and provide convenient services to the people.
The anonymity and decentralization of Bitcoin make it widely accepted in illegal transactions, such as money laundering, drug and weapon trafficking, gambling, to name a few, which has already caused significant security risk all around the world. The obvious de-anonymity approach that matches transaction addresses and users is not possible in practice due to limited annotated data set. In this paper, we divide addresses into four types, exchange, gambling, service, and general, and propose targeted addresses identification algorithms with high fault tolerance which may be employed in a wide range of applications. We use network representation learning to extract features and train imbalanced multi-classifiers. Experimental results validated the effectiveness of the proposed method.
Most of the data manipulation attacks on deep neural networks (DNNs) during the training stage introduce a perceptible noise that can be catered by preprocessing during inference, or can be identified during the validation phase. There-fore, data poisoning attacks during inference (e.g., adversarial attacks) are becoming more popular. However, many of them do not consider the imperceptibility factor in their optimization algorithms, and can be detected by correlation and structural similarity analysis, or noticeable (e.g., by humans) in multi-level security system. Moreover, majority of the inference attack rely on some knowledge about the training dataset. In this paper, we propose a novel methodology which automatically generates imperceptible attack images by using the back-propagation algorithm on pre-trained DNNs, without requiring any information about the training dataset (i.e., completely training data-unaware). We present a case study on traffic sign detection using the VGGNet trained on the German Traffic Sign Recognition Benchmarks dataset in an autonomous driving use case. Our results demonstrate that the generated attack images successfully perform misclassification while remaining imperceptible in both “subjective” and “objective” quality tests.
This paper presents TrustSign, a novel, trusted automatic malware signature generation method based on high-level deep features transferred from a VGG-19 neural network model pre-trained on the ImageNet dataset. While traditional automatic malware signature generation techniques rely on static or dynamic analysis of the malware's executable, our method overcomes the limitations associated with these techniques by producing signatures based on the presence of the malicious process in the volatile memory. Signatures generated using TrustSign well represent the real malware behavior during runtime. By leveraging the cloud's virtualization technology, TrustSign analyzes the malicious process in a trusted manner, since the malware is unaware and cannot interfere with the inspection procedure. Additionally, by removing the dependency on the malware's executable, our method is capable of signing fileless malware. Thus, we focus our research on in-browser cryptojacking attacks, which current antivirus solutions have difficulty to detect. However, TrustSign is not limited to cryptojacking attacks, as our evaluation included various ransomware samples. TrustSign's signature generation process does not require feature engineering or any additional model training, and it is done in a completely unsupervised manner, obviating the need for a human expert. Therefore, our method has the advantage of dramatically reducing signature generation and distribution time. The results of our experimental evaluation demonstrate TrustSign's ability to generate signatures invariant to the process state over time. By using the signatures generated by TrustSign as input for various supervised classifiers, we achieved 99.5% classification accuracy.
As opposed to a traditional power grid, a smart grid can help utilities to save energy and therefore reduce the cost of operation. It also increases reliability of the system In smart grids the quality of monitoring and control can be adequately improved by incorporating computing and intelligent communication knowledge. However, this exposes the system to false data injection (FDI) attacks and the system becomes vulnerable to intrusions. Therefore, it is important to detect such false data injection attacks and provide an algorithm for the protection of system against such attacks. In this paper a comparison between three FDI detection methods has been made. An H2 control method has then been proposed to detect and control the false data injection on a 12th order model of a smart grid. Disturbances and uncertainties were added to the system and the results show the system to be fully controllable. This paper shows the implementation of a feedback controller to fully detect and mitigate the false data injection attacks. The controller can be incorporated in real life smart grid operations.
In this work, we applied deep semantic analysis, and machine learning and deep learning techniques, to capture inherent characteristics of email text, and classify emails as phishing or non -phishing.
While there has been considerable research on making power grid Supervisory Control and Data Acquisition (SCADA) systems resilient to attacks, the problem of transitioning these technologies into deployed SCADA systems remains largely unaddressed. We describe our experience and lessons learned in deploying an intrusion-tolerant SCADA system in two realistic environments: a red team experiment in 2017 and a power plant test deployment in 2018. These experiences resulted in technical lessons related to developing an intrusion-tolerant system with a real deployable application, preparing a system for deployment in a hostile environment, and supporting protocol assumptions in that hostile environment. We also discuss some meta-lessons regarding the cultural aspects of transitioning academic research into practice in the power industry.