Biblio
Automated network control and management has been a long standing target of network protocols. We address in this paper the question of automated protocol design, where distributed networked nodes have to cooperate to achieve a common goal without a priori knowledge on which information to exchange or the network topology. While reinforcement learning has often been proposed for this task, we propose here to apply recent methods from semi-supervised deep neural networks which are focused on graphs. Our main contribution is an approach for applying graph-based deep learning on distributed routing protocols via a novel neural network architecture named Graph-Query Neural Network. We apply our approach to the tasks of shortest path and max-min routing. We evaluate the learned protocols in cold-start and also in case of topology changes. Numerical results show that our approach is able to automatically develop efficient routing protocols for those two use-cases with accuracies larger than 95%. We also show that specific properties of network protocols, such as resilience to packet loss, can be explicitly included in the learned protocol.
Silicon Physical Unclonable Function (PUF) is arguably the most promising hardware security primitive. In particular, PUFs that are capable of generating a large amount of challenge response pairs (CRPs) can be used in many security applications. However, these CRPs can also be exploited by machine learning attacks to model the PUF and predict its response. In this paper, we first show that, based on data in the public domain, two popular PUFs that can generate CRPs (i.e., arbiter PUF and reconfigurable ring oscillator (RO) PUF) can be broken by simple logistic regression (LR) attack with about 99% accuracy. We then propose a feedback structure to XOR the PUF response with the challenge and challenge the PUF again to generate the response. Results show that this successfully reduces LR's learning accuracy to the lower 50%, but artificial neural network (ANN) learning attack still has an 80% success rate. Therefore, we propose a configurable ring oscillator based dual-mode PUF which works with both odd number of inverters (like the reconfigurable RO PUF) and even number of inverters (like a bistable ring (BR) PUF). Since currently there are no known attacks that can model both RO PUF and BR PUF, the dual-mode PUF will be resistant to modeling attacks as long as we can hide its working mode from the attackers, which we achieve with two practical methods. Finally, we implement the proposed dual-mode PUF on Nexys 4 FPGA boards and collect real measurement to show that it reduces the learning accuracy of LR and ANN to the mid-50% and low 60%, respectively. In addition, it meets the PUF requirements of uniqueness, randomness, and robustness.
The power of artificial neural networks to form predictive models for phenomenon that exhibit non-linear relationships is a given fact. Despite this advantage, artificial neural networks are known to suffer drawbacks such as long training times and computational intensity. The researchers propose a two-tiered approach to enhance the learning performance of artificial neural networks for phenomenon with time series where data exhibits predictable changes that occur every calendar year. This paper focuses on the initial results of the first phase of the proposed algorithm which incorporates clustering and classification prior to application of the backpropagation algorithm. The 2016–2017 zonal load data of France is used as the data set. K-means is chosen as the clustering algorithm and a comparison is made between Naïve Bayes and k-Nearest Neighbors to determine the better classifier for this data set. The initial results show that electrical load behavior is not necessarily reflective of calendar clustering even without using the min-max temperature recorded during the inclusive months. Simulating the day-type classification process using one cluster, initial results show that the k-nearest neighbors outperforms the Naïve Bayes classifier for this data set and that the best feature to be used for classification into day type is the daily min-max load. These classified load data is expected to reduce training time and improve the overall performance of short-term load demand predictive models in a future paper.
Virtualization enables datacenter operators to safely run computations that belong to untrusted tenants. An ideal virtual machine has three properties: a small memory footprint; strong isolation from other VMs and the host OS; and the ability to maintain in-memory state across client requests. Unfortunately, modern virtualization technologies cannot provide all three properties at once. In this paper, we explain why, and propose a new virtualization approach, called Alto, that virtualizes at the layer of a managed runtime interface. Through careful design of (1) the application-facing managed interface and (2) the internal runtime architecture, Alto provides VMs that are small, secure, and stateful. Conveniently, Alto also simplifies VM operations like suspension, migration, and resumption. We provide several details about the proposed design, and discuss the remaining challenges that must be solved to fully realize the Alto vision.
Shielding systems such as AMD's Secure Encrypted Virtualization aim to protect a virtual machine from a higher privileged entity such as the hypervisor. A cornerstone of these systems is the ability to protect the memory from unauthorized accesses. Despite this protection mechanism, previous attacks leveraged the control over memory resources to infer control flow of applications running in a shielded system. While previous works focused on a specific target application, there has been no general analysis on how the control flow of a protected application can be inferred. This paper tries to overcome this gap by providing a detailed analysis on the detectability of control flow using memory access patterns. To that end, we do not focus on a specific shielding system or a specific target application, but present a framework which can be applied to different types of shielding systems as well as to different types of attackers. By training a random forest classifier on the memory accesses emitted by syscalls of a shielded entity, we show that it is possible to infer the control flow of shielded entities with a high degree of accuracy.
ARM has become the leading processor architecture for mobile and IoT devices, while it has recently started claiming a bigger slice of the server market pie as well. As such, it will not be long before malware more regularly target the ARM architecture. Therefore, the stealthy operation of Virtual Machine Introspection (VMI) is an obligation to successfully analyze and proactively mitigate this growing threat. Stealthy VMI has proven itself perfectly suitable for malware analysis on Intel's architecture, yet, it often lacks the foundation required to be equally effective on ARM.
Virtualization offers the possibility of hosting services of multiple customers on shared hardware. When more than one Virtual Machine (VM) run on the same host, memory deduplication can save physical memory by merging identical pages of the VMs. However, this comes at the cost of leaking information between VMs. Based on that, we propose a novel timing-based side-channel attack that allows to identify software versions running in co-resident VMs or on the host. Our attack tests for the existence of memory pages in co-resident VMs that are unique among all versions of the respective software. Our evaluation results indicate that with few repetitions of our attack we can precisely identify software versions within reasonable time frames.
Infrastructure-as-a-Service clouds provide out-of-band remote management for users to access their virtual machines (VMs). Out-of-band remote management is a method for indirectly accessing VMs via their virtual devices. While virtual devices running in the virtualized system are managed by cloud operators, not all cloud operators are always trusted in clouds. To prevent information leakage from virtual devices and tampering with their I/O data, several systems have been proposed by trusting the hypervisor in the virtualized system. However, they have various issues on security and management. This paper proposes VSBypass, which enables secure out-of-band remote management outside the virtualized system using a technique called transparent passthrough. VSBypass runs the entire virtualized system in an outer VM using nested virtualization. Then it intercepts I/O requests of out-of-band remote management and processes those requests in shadow devices, which run outside the virtualized system. We have implemented VSBypass in Xen for the virtual serial console and GUI remote access. We confirmed that information leakage was prevented and that the performance was comparable to that in traditional out-of-band remote management.
Virtual Machine Introspection (VMI) is an emerging family of techniques for extracting data from virtual machines without the use of active monitoring probes within the target machines themselves. In VMI based systems, the data is collected at the hypervisor-level by analyzing the state of virtual machines. This has the benefit of making collection harder to detect and block by malware as there is nothing in the machine indicating that monitoring is taking place. In this paper we present Nitro Web, a web-based monitoring system for virtual machines that uses virtual machine introspection for data collection. The platform is capable of detecting and visualizing system call activity taking place within virtual machines in real-time. The secondary purpose of this paper is to offer an introduction to Nitro virtual machine introspection framework that we have been involved in developing. In this paper, we reflect on how Nitro Framework can be used for building applications making use of VMI data.
Cyber-Physical Systems (CPS), such as Water Distribution Networks (WDNs), deploy digital devices to monitor and control the behavior of physical processes. These digital devices, however, are susceptible to cyber and physical attacks, that may alter their functionality, and therefore the integrity of their measurements/actions. In practice, industrial control systems utilize simple control laws, which rely on various sensor measurements and algorithms which are expected to operate normally. To reduce the impact of a potential failure, operators may deploy redundant components; this however may not be useful, e.g., when a cyber attack at a PLC component occurs. In this work, we address the problem of reducing vulnerability to cyber-physical attacks in water distribution networks. This is achieved by augmenting the graph which describes the information flow from sensors to actuators, by adding new connections and algorithms, to increase the number of redundant cyber components. These, in turn, increase the \textitcyber-physical security level, which is defined in the present paper as the number of malicious attacks a CPS may sustain before becoming unable to satisfy the control requirements. A proof-of-concept of the approach is demonstrated over a simple WDN, with intuition on how this can be used to increase the cyber-physical security level of the system.
An attack detection scheme is proposed to detect data integrity attacks on sensors in Cyber-Physical Systems (CPSs). A combined fingerprint for sensor and process noise is created during the normal operation of the system. Under sensor spoofing attack, noise pattern deviates from the fingerprinted pattern enabling the proposed scheme to detect attacks. To extract the noise (difference between expected and observed value) a representative model of the system is derived. A Kalman filter is used for the purpose of state estimation. By subtracting the state estimates from the real system states, a residual vector is obtained. It is shown that in steady state the residual vector is a function of process and sensor noise. A set of time domain and frequency domain features is extracted from the residual vector. Feature set is provided to a machine learning algorithm to identify the sensor and process. Experiments are performed on two testbeds, a real-world water treatment (SWaT) facility and a water distribution (WADI) testbed. A class of zero-alarm attacks, designed for statistical detectors on SWaT are detected by the proposed scheme. It is shown that a multitude of sensors can be uniquely identified with accuracy higher than 90% based on the noise fingerprint.
We study a dataset of billions of program binary files that appeared on 100 million computers over the course of 12 months, discovering that 94% of these files were present on a single machine. Though malware polymorphism is one cause for the large number of singleton files, additional factors also contribute to polymorphism, given that the ratio of benign to malicious singleton files is 80:1. The huge number of benign singletons makes it challenging to reliably identify the minority of malicious singletons. We present a large-scale study of the properties, characteristics, and distribution of benign and malicious singleton files. We leverage the insights from this study to build a classifier based purely on static features to identify 92% of the remaining malicious singletons at a 1.4% percent false positive rate, despite heavy use of obfuscation and packing techniques by most malicious singleton files that we make no attempt to de-obfuscate. Finally, we demonstrate robustness of our classifier to important classes of automated evasion attacks.
Return-Oriented Programming (ROP) has emerged as one of the most widely used techniques to exploit software vulnerabilities. Unfortunately, existing ROP protections suffer from a number of shortcomings: they require access to source code and compiler support, focus on specific types of gadgets, depend on accurate disassembly and construction of Control Flow Graphs, or use hardware-dependent (microarchitectural) characteristics. In this paper, we propose EigenROP, a novel system to detect ROP payloads based on unsupervised statistical learning of program characteristics. We study, for the first time, the feasibility and effectiveness of using microarchitecture-independent program characteristics – namely, memory locality, register traffic, and memory reuse distance – for detecting ROP. We propose a novel directional statistics based algorithm to identify deviations from the expected program characteristics during execution. EigenROP works transparently to the protected program, without requiring debug information, source code or disassembly. We implemented a dynamic instrumentation prototype of EigenROP using Intel Pin and measured it against in-the-wild ROP exploits and on payloads generated by the ROP compiler ROPC. Overall, EigenROP achieved significantly higher accuracy than prior anomaly-based solutions. It detected the execution of the ROP gadget chains with 81% accuracy, 80% true positive rate, only 0.8% false positive rate, and incurred comparable overhead to similar Pin-based solutions. This article is summarized in: the morning paper an interesting/influential/important paper from the world of CS every weekday morning, as selected by Adrian Colyer
In order to overcome the excessive false detection of marginal noise and the object holes of the existing algorithm in outdoor panoramic surveillance, a moving object detection algorithm based on pixel background sample sets in panoramic scanning mode is proposed. In the light of the space distribution characteristics, neighborhood pixels have similar values. Therefore, a background sample set for each pixel is created by random sampling in the first scanning cycle which effectively avoids the false detection of marginal noise and reduces the time cost of background model establishment. The adjacent frame difference detection algorithm in the traditional camera motion mode is prone to object holes. To solve this problem, detection based on background sample sets is presented to obtain complete moving object region. The results indicate that the proposed moving object detection algorithm works more efficiently on reducing marginal noise interference, and obtains complete moving object information compared with the frame difference detection algorithm based on registration results in traditional camera motion mode, thereby meeting the needs of real-time detection as well as improving its accuracy.
Many network intrusion detection systems use byte sequences to detect lateral movements that exploit remote vulnerabilities. Attackers bypass such detection by stealing valid credentials and using them to transmit from one computer to another without creating abnormal network traffic. We call this method Credential-based Lateral Movement. To detect this type of lateral movement, we develop the concept of a Network Login Structure that specifies normal logins within a given network. Our method models a network login structure by automatically extracting a collection of login patterns by using a variation of the market-basket algorithm. We then employ an anomaly detection approach to detect malicious logins that are inconsistent with the enterprise network's login structure. Evaluations show that the proposed method is able to detect malicious logins in a real setting. In a simulated attack, our system was able to detect 82% of malicious logins, with a 0.3% false positive rate. We used a real dataset of millions of logins over the course of five months within a global financial company for evaluation of this work.
During the recent years there has been an increased focus on preventing and detecting insider attacks and data thefts. A promising approach has been the construction of data loss prevention systems (DLP) that scan outgoing traffic for sensitive data. However, these automated systems are plagued with a high false positive rate. In this paper we introduce the concept of a meta-score that uses the aggregated output from DLP systems to detect and flag behavior indicative of data leakage. The proposed internal/insider threat score is built on the idea of detecting discrepancies between the userassigned sensitivity level and the sensitivity level inferred by the DLP system, and captures the likelihood that a given entity is leaking data. The practical usefulness of the proposed score is demonstrated on the task of identifying likely internal threats.
To detect data races that harm production systems, program analysis must target production runs. However, sound and precise data race detection adds too much run-time overhead for use in production systems. Even existing approaches that provide soundness or precision incur significant limitations. This work addresses the need for soundness (no missed races) and precision (no false races) by introducing novel, efficient production-time analyses that address each need separately. (1) Precise data race detection is useful for developers, who want to fix bugs but loathe false positives. We introduce a precise analysis called RaceChaser that provides low, bounded run-time overhead. (2) Sound race detection benefits analyses and tools whose correctness relies on knowledge of all potential data races. We present a sound, efficient approach called Caper that combines static and dynamic analysis to catch all data races in observed runs. RaceChaser and Caper are useful not only on their own; we introduce a framework that combines these analyses, using Caper as a sound filter for precise data race detection by RaceChaser. Our evaluation shows that RaceChaser and Caper are efficient and effective, and compare favorably with existing state-of-the-art approaches. These results suggest that RaceChaser and Caper enable practical data race detection that is precise and sound, respectively, ultimately leading to more reliable software systems.
We present DECANTeR, a system to detect anomalous outbound HTTP communication, which passively extracts fingerprints for each application running on a monitored host. The goal of our system is to detect unknown malware and backdoor communication indicated by unknown fingerprints extracted from a host's network traffic. We evaluate a prototype with realistic data from an international organization and datasets composed of malicious traffic. We show that our system achieves a false positive rate of 0.9% for 441 monitored host machines, an average detection rate of 97.7%, and that it cannot be evaded by malware using simple evasion techniques such as using known browser user agent values. We compare our solution with DUMONT [24], the current state-of-the-art IDS which detects HTTP covert communication channels by focusing on benign HTTP traffic. The results show that DECANTeR outperforms DUMONT in terms of detection rate, false positive rate, and even evasion-resistance. Finally, DECANTeR detects 96.8% of information stealers in our dataset, which shows its potential to detect data exfiltration.
With Wireless Access in Vehicular Environment (WAVE) finalized for legal enforcement from 2020 after the recent move by the U.S. Government, data plausibility is still an unresolved security issue. In particular, an attacker may forge false position values in safety beacons in order to cause unsafe response from startled receiving vehicles. The data plausibility is a longstanding issue for which various approaches based on sensor fusion, behavior analysis and communication constraints have been proposed, but none of these completely solve the problem. This paper proposes an angle of arrival (AoA) based method to invalidate position forging adversaries such as roadside attackers. Built entirely on the WAVE framework, it can be used even when the traditional sensor fusion-based or behavior-based check is inapplicable. The proposed approach is a completely passive scheme that does not require more than an additional antenna that is strongly recommended for performance anyway.
Nowadays data is always stored in a computer in the hyper-connected world and, a company or an organization or a person can come across financial loss, reputation loss, business disruption and intellectual property loss because of data leakage or data disclosure. Remote Access Trojans are used to invade a victim's PC and collect information from it. There have been signatures for these that have already emerged and defined as malwares, but there is no available signature yet if a malware or a remote access Trojan is a zero-day threat. In this circumstance network behavioral analysis is more useful than signature-based anti-virus scanners in order to detect the different behavior of malware. When the traffic will be cut or stoppedis important in capturing network traffic. In this paper, effective features for detecting RATs are proposed. These features are extracted from the first twenty packets. Our approach achieves 98% accuracy and 10% false negative rate by random forest algorithm.
Traditional security measures for large-scale critical infrastructure systems have focused on keeping adversaries out of the system. As the Internet of Things (IoT) extends into millions of homes, with tens or hundreds of devices each, the threat landscape is complicated. IoT devices have unknown access capabilities with unknown reach into other systems. This paper presents ongoing work on how techniques in sensor verification and cyber-physical modeling and analysis on bulk power systems can be applied to identify malevolent IoT devices and secure smart and connected communities against the most impactful threats.
Hadoop is developed as a distributed data processing platform for analyzing big data. Enterprises can analyze big data containing users' sensitive information by using Hadoop and utilize them for their marketing. Therefore, researches on data encryption have been widely done to protect the leakage of sensitive data stored in Hadoop. However, the existing researches support only the AES international standard data encryption algorithm. Meanwhile, the Korean government selected ARIA algorithm as a standard data encryption scheme for domestic usages. In this paper, we propose a HDFS data encryption scheme which supports both ARIA and AES algorithms on Hadoop. First, the proposed scheme provides a HDFS block-splitting component that performs ARIA/AES encryption and decryption under the Hadoop distributed computing environment. Second, the proposed scheme provides a variable-length data processing component that can perform encryption and decryption by adding dummy data, in case when the last data block does not contains 128-bit data. Finally, we show from performance analysis that our proposed scheme is efficient for various applications, such as word counting, sorting, k-Means, and hierarchical clustering.
In this paper, we propose a CPA-Secure encryption scheme with equality test. Unlike other public key solutions, in our scheme, only the data owner can encrypt the message and get the comparable ciphertext, and only the tester with token who can perform the equality test. Our encryption scheme is based on multiplicative homomorphism of ElGamal Encryption and Non Interactive Zero Knowledge proof of Discrete Log. We proof that the proposed scheme is OW-CPA security under the attack of the adversary who has equality test token, and IND-CPA security under the attack of adversary who can not test the equality. The proposed scheme only suppose to compare two ciphertexts encrypted by same user, though it is less of flexibility, it is efficient and more suitable for data outsourcing scenario.