Biblio
The modern world has witnessed a dramatic increase in our ability to collect, transmit and distribute real-time monitoring and surveillance data from large-scale information systems and cyber-physical systems. Detecting system anomalies thus attracts a significant amount of interest in many fields such as security, fault management, and industrial optimization. Recently, the invariant network has been shown to be a powerful way of characterizing complex system behaviours. In an invariant network, a node represents a system component and an edge indicates a stable, significant interaction between two components. Structures and evolutions of the invariant network, in particular the vanishing correlations, can shed important light on locating causal anomalies and performing diagnosis. However, existing approaches to detecting causal anomalies with the invariant network often use the percentage of vanishing correlations to rank possible causal components, which has several limitations: 1) fault propagation in the network is ignored; 2) the root causal anomalies may not always be the nodes with a high percentage of vanishing correlations; 3) temporal patterns of vanishing correlations are not exploited for robust detection. To address these limitations, in this paper we propose a network diffusion based framework to identify significant causal anomalies and rank them. Our approach can effectively model fault propagation over the entire invariant network, and can perform joint inference on both the structural and the time-evolving broken invariance patterns. As a result, it can locate high-confidence anomalies that are truly responsible for the vanishing correlations, and can compensate for unstructured measurement noise in the system. Extensive experiments on synthetic datasets, bank information system datasets, and coal plant cyber-physical system datasets demonstrate the effectiveness of our approach.
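As a rough illustration of the diffusion idea (not the paper's exact formulation), the following sketch spreads per-node broken-invariance evidence over the invariant network with a random-walk-with-restart style update and ranks nodes by the resulting score; all names and parameters are hypothetical.

```python
# Minimal sketch (not the paper's exact model): rank candidate causal anomalies
# by diffusing observed broken-invariance evidence over the invariant network.
import numpy as np

def rank_causal_anomalies(A, broken_fraction, restart=0.5, iters=100, tol=1e-8):
    """A: symmetric adjacency of the invariant network (n x n).
    broken_fraction: per-node fraction of vanished correlations (length n)."""
    deg = A.sum(axis=1)
    deg[deg == 0] = 1.0                      # avoid division by zero
    W = A / deg[:, None]                     # row-normalised transition matrix
    e = broken_fraction / (broken_fraction.sum() + 1e-12)  # evidence prior
    r = e.copy()
    for _ in range(iters):
        r_new = restart * e + (1 - restart) * W.T @ r      # diffuse evidence
        if np.abs(r_new - r).sum() < tol:
            break
        r = r_new
    return np.argsort(-r)                    # node indices, most anomalous first

# toy usage: 4-node chain, node 0 has the most vanished correlations
A = np.array([[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]], float)
print(rank_causal_anomalies(A, np.array([0.9, 0.3, 0.1, 0.0])))
```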
Malware detection has been widely studied by analysing either file dropping relationships or characteristics of the file distribution network. This paper, for the first time, studies a global heterogeneous malware delivery graph that fuses file dropping relationships with the topology of the file distribution network. The integration offers a unique ability to structure the end-to-end distribution relationship. However, it brings large heterogeneous graphs to the analysis. In our study, an average daily generated graph has more than 4 million edges and 2.7 million nodes that differ in type, such as IPs, URLs, and files. We propose a novel Bayesian label propagation model to unify the multi-source information, including content-agnostic features of different node types and topological information of the heterogeneous network. Our approach neither examines the source code nor inspects the dynamic behaviour of a binary. Instead, it estimates the maliciousness of a given file through a semi-supervised label propagation procedure, which has a linear time complexity w.r.t. the number of nodes and edges. The evaluation on 567 million real-world download events validates that our proposed approach efficiently detects malware with high accuracy.
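For intuition, a minimal semi-supervised label propagation sketch over a heterogeneous download graph is shown below. It is not the paper's Bayesian model, and the node names and damping value are made up, but it illustrates why each iteration costs time linear in the number of nodes and edges.

```python
# Minimal sketch, not the paper's Bayesian formulation: semi-supervised label
# propagation of a "maliciousness" score over a heterogeneous download graph.
# Each iteration touches every edge a constant number of times.
from collections import defaultdict

def propagate_maliciousness(edges, seeds, iters=20, damping=0.85):
    """edges: iterable of (node_u, node_v) pairs (IPs, URLs, file hashes, ...).
    seeds: dict node -> known score (1.0 = known malware, 0.0 = known benign)."""
    adj = defaultdict(list)
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    score = {n: seeds.get(n, 0.5) for n in adj}        # unknown nodes start neutral
    for _ in range(iters):
        new = {}
        for n, nbrs in adj.items():
            if n in seeds:                             # clamp labelled nodes
                new[n] = seeds[n]
                continue
            nbr_avg = sum(score[m] for m in nbrs) / len(nbrs)
            new[n] = damping * nbr_avg + (1 - damping) * 0.5
        score = new
    return score

# toy usage: a dropper URL linked to a known-bad file raises the score of
# everything else it touches.
edges = [("url:a", "file:x"), ("file:x", "ip:1"), ("url:a", "file:y")]
print(propagate_maliciousness(edges, {"file:x": 1.0}))
```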
Convolution serves as the basic computational primitive for various associative computing tasks ranging from edge detection to image matching. CMOS implementation of such computations entails significant bottlenecks in area and energy consumption due to the large number of multiplication and addition operations involved. In this paper, we propose an ultra-low-power and compact hybrid spintronic-CMOS design for the convolution computing unit. Low-voltage operation of domain-wall-motion-based magneto-metallic "spin memristors" interfaced with CMOS circuits is able to perform the convolution operation with reasonable accuracy. Simulation results of Gabor filtering for edge detection reveal approximately 2.5× lower energy consumption compared to a baseline 45nm-CMOS implementation.
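A conventional software reference for the computation being accelerated, Gabor filtering for edge detection, might look like the sketch below. It is illustrative only: the kernel parameters are arbitrary and it says nothing about the spintronic-CMOS implementation itself.

```python
# Software reference for the accelerated computation: convolve an image with a
# Gabor kernel for edge detection (a sketch, not the hardware design).
import numpy as np
from scipy.signal import convolve2d

def gabor_kernel(size=9, sigma=2.0, theta=0.0, lam=4.0, gamma=0.5):
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)
    yr = -x * np.sin(theta) + y * np.cos(theta)
    return np.exp(-(xr**2 + gamma**2 * yr**2) / (2 * sigma**2)) * np.cos(2 * np.pi * xr / lam)

image = np.random.rand(64, 64)                         # stand-in for a test image
edges = convolve2d(image, gabor_kernel(theta=np.pi / 2), mode="same")
print(edges.shape)
```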
The premise of this year's SafeConfig Workshop is that existing tools and methods for security assessments are necessary but insufficient for scientifically rigorous testing and evaluation of resilient and active cyber systems. The objective of this workshop is the exploration and discussion of scientifically sound testing regimens that will continuously and dynamically probe, attack, and "test" the various resilient and active technologies. This adaptation and change in focus necessitates, at the very least, modification of existing approaches and, potentially, wholesale new developments to ensure that resilient- and agile-aware security testing is available to the research community. All testing, validation and experimentation must also be repeatable, reproducible, subject to scientific scrutiny, measurable and meaningful to both researchers and practitioners.
In this study, we present WindTalker, a novel and practical keystroke inference framework that allows an attacker to infer sensitive keystrokes on a mobile device through WiFi-based side-channel information. WindTalker is motivated by the observation that keystrokes on mobile devices lead to different hand coverage and finger motions, which introduce a unique interference to the multi-path signals that is reflected in the channel state information (CSI). The adversary can exploit the strong correlation between CSI fluctuations and keystrokes to infer the user's number input. WindTalker presents a novel approach to collect the target's CSI data by deploying a public WiFi hotspot. Compared with previous keystroke inference approaches, WindTalker neither deploys external devices close to the target device nor compromises the target device. Instead, it utilizes the public WiFi to collect the user's CSI data, which makes it easy to deploy and difficult to detect. In addition, it jointly analyzes the traffic and the CSI to launch keystroke inference only during the sensitive period where password entry occurs. WindTalker can be launched without visually observing the smartphone user's input process or backside motion, and without installing any malware on the target device. We implemented WindTalker on several mobile phones and performed a detailed case study to evaluate the practicality of password inference against Alipay, the largest mobile payment platform in the world. The evaluation results show that the attacker can recover the key with a high success rate.
We develop and evaluate a data hiding method that enables smartphones to encrypt and embed sensitive information into carrier streams of sensor data. Our evaluation considers multiple handsets and a variety of data types, and we demonstrate that our method has a computational cost that allows real-time data hiding on smartphones with negligible distortion of the carrier stream. These characteristics make it suitable for smartphone applications involving privacy-sensitive data such as medical monitoring systems and digital forensics tools.
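As a generic illustration of the data-hiding idea (not the authors' specific embedding or encryption scheme), the sketch below hides payload bytes in the least-significant bits of a sensor carrier stream, perturbing each carried sample by at most one quantization step.

```python
# Generic illustration of hiding (already-encrypted) bytes in a sensor carrier
# stream by overwriting least-significant bits; not the authors' method.
import numpy as np

def embed(carrier, payload_bytes):
    bits = np.unpackbits(np.frombuffer(payload_bytes, dtype=np.uint8))
    samples = carrier.astype(np.int32).copy()
    samples[:len(bits)] = (samples[:len(bits)] & ~1) | bits   # overwrite LSBs
    return samples

def extract(stego, n_bytes):
    bits = (stego[:n_bytes * 8] & 1).astype(np.uint8)
    return np.packbits(bits).tobytes()

carrier = np.random.randint(0, 4096, size=1024)   # e.g. raw accelerometer ADC values
stego = embed(carrier, b"secret")                 # payload would be encrypted first
print(extract(stego, 6))                          # b'secret'
```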
When filling out privacy-related forms in public places such as hospitals or clinics, people are usually not aware that the sound of their handwriting leaks personal information. In this paper, we explore the possibility of eavesdropping on handwriting via nearby mobile devices based on audio signal processing and machine learning. By presenting a proof-of-concept system, WritingHacker, we show how mobile devices can be used to collect the sound of victims' handwriting and to extract handwriting-specific features for machine learning based analysis. WritingHacker focuses on the situation where the victim's handwriting follows a certain print style. An attacker can keep a mobile device, such as a common smartphone, touching the desk used by the victim to record the audio signals of handwriting. The system can then provide a word-level estimate of the content of the handwriting. To reduce the impact of varying writing habits and writing locations, the system utilizes letter clustering and dictionary filtering. Our prototype system's experimental results show that the accuracy of word recognition reaches around 50%-60% under certain conditions, which reveals the danger of privacy leakage through the sound of handwriting.
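The dictionary-filtering step can be pictured with the toy sketch below: acoustically similar letters are merged into clusters, and a detected cluster sequence is matched against a wordlist. The cluster assignments here are invented examples, not those learned by WritingHacker.

```python
# Toy sketch of dictionary filtering over letter clusters (cluster assignments
# are invented for illustration).
CLUSTER = {"a": 0, "e": 0, "o": 0,      # hypothetical acoustically similar groups
           "b": 1, "d": 1, "p": 1,
           "s": 2, "c": 2, "t": 3, "n": 4, "r": 5}

def cluster_seq(word):
    return tuple(CLUSTER.get(ch, -1) for ch in word)

def candidate_words(detected_seq, dictionary):
    return [w for w in dictionary if cluster_seq(w) == detected_seq]

dictionary = ["bat", "pat", "dot", "ran"]
print(candidate_words(cluster_seq("bat"), dictionary))   # ['bat', 'pat', 'dot']
```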
Hardware security has emerged as an important topic in the wake of increasing threats on integrated circuits which include reverse engineering, intellectual property (IP) piracy and overbuilding. This paper explores obfuscation of circuits as a hardware security measure and specifically targets digital signal processing (DSP) circuits which are part of most modern systems. The idea of using desired and undesired modes to design obfuscated DSP functions is illustrated using the fast Fourier transform (FFT) as an example. The selection of a mode is dependent on a key input to the circuit. The system is said to work in its desired mode of operation only if the correct key is applied. Other undesired modes are built into the design to confuse an adversary. The approach to obfuscating the design involves control-flow modifications which alter the computations from the desired mode. We present simulation and synthesis results on a reconfigurable, 2-parallel FFT and discuss the security of this approach. It is shown that the proposed approach results in a reconfigurable and flexible design at an area overhead of 8% and a power overhead of 10%.
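For intuition only, the sketch below mimics the key-gated mode idea in software: the correct key selects the desired FFT mode, while any other key silently selects an undesired mode that corrupts the computation. It is not the paper's control-flow obfuscation, and the key value is hypothetical.

```python
# Illustration only (not the paper's control-flow obfuscation): a key-gated FFT
# whose output is correct only when the correct activation key is supplied.
import numpy as np

CORRECT_KEY = 0b1011          # hypothetical 4-bit activation key

def obfuscated_fft(x, key):
    n = len(x)
    # The key seeds a permutation of the input ordering; only the correct key
    # yields the identity ordering and therefore the true FFT result.
    rng = np.random.default_rng(key ^ CORRECT_KEY)
    order = np.arange(n) if key == CORRECT_KEY else rng.permutation(n)
    return np.fft.fft(np.asarray(x)[order])

x = np.sin(2 * np.pi * 3 * np.arange(16) / 16)
good = obfuscated_fft(x, CORRECT_KEY)
bad = obfuscated_fft(x, 0b0110)
print(np.allclose(good, np.fft.fft(x)), np.allclose(bad, np.fft.fft(x)))
```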
In this paper, we address the design and implementation of low-power embedded systems for real-time tracking of humans and vehicles. Such systems are important in applications such as activity monitoring and border security. We motivate the utility of mobile devices in prototyping the targeted class of tracking systems, and demonstrate a dataflow-based and cross-platform design methodology that enables efficient experimentation with key aspects of our tracking system design, including real-time operation, experimentation with advanced sensors, and streamlined management of design versions on host and mobile platforms. Our experiments demonstrate the utility of our mobile-device-targeted design methodology in validating tracking algorithm operation; evaluating real-time performance, energy efficiency, and accuracy of tracking system execution; and quantifying trade-offs involving the use of advanced sensors, which offer improved sensing accuracy at the expense of increased cost and weight. Additionally, through application of a novel, cross-platform, model-based design approach, our design requires no change in source code when migrating from an initial, host-computer-based functional reference to a fully functional implementation on the targeted mobile device.
Wearable personal health monitoring systems can offer a cost-effective solution for human healthcare. These systems must provide highly accurate, secure, and fast processing and delivery of vast amounts of data. In addition, wearable biomedical devices are used in inpatient, outpatient, and at-home e-patient care, where they must constantly monitor the patient's biomedical and physiological signals 24/7. These biomedical applications require sampling and processing multiple streams of physiological signals within strict power and area footprints. The processing typically consists of feature extraction, data fusion, and classification stages that require a large number of digital signal processing and machine learning kernels. In response to these requirements, in this paper, a low-power, domain-specific many-core accelerator named Power Efficient Nano Clusters (PENC) is proposed to map and execute the kernels of these applications. Experimental results show that the manycore is able to reduce energy consumption by up to 80% and 14% for DSP and machine learning kernels, respectively, when optimally parallelized. The performance of the proposed PENC manycore when acting as a coprocessor to an Intel Atom processor is compared with existing commercial off-the-shelf embedded processing platforms, including the Intel Atom, Xilinx Artix-7 FPGA, and NVIDIA TK1 ARM-A15 with GPU SoC. The results show that the PENC manycore architecture reduces energy by as much as 10× while outperforming all off-the-shelf embedded processing platforms across all studied machine learning classifiers.
Ethernet technology dominates enterprise and home network installations and is present in datacenters as well as parts of the backbone of the Internet. Due to their wireline nature, Ethernet networks are often assumed to intrinsically protect the exchanged data against attacks carried out by eavesdroppers and malicious attackers who do not have physical access to network devices, patch panels and network outlets. In this work, we practically evaluate the possibility of wireless attacks against wired Ethernet installations with respect to resistance against eavesdropping, using off-the-shelf software-defined radio platforms. Our results clearly indicate that twisted-pair network cables radiate enough electromagnetic waves to reconstruct transmitted frames with negligible bit error rates, even when the cables are not damaged at all. Since this allows an attacker to stay undetected, it underlines the need for link-layer encryption or physical-layer security to protect confidentiality.
Recurrent neural networks (RNNs) were recently proposed for the session-based recommendation task. The models showed promising improvements over traditional recommendation approaches. In this work, we further study RNN-based models for session-based recommendations. We propose the application of two techniques to improve model performance, namely data augmentation and a method to account for shifts in the input data distribution. We also empirically study the use of generalised distillation, and a novel alternative model that directly predicts item embeddings. Experiments on the RecSys Challenge 2015 dataset demonstrate relative improvements of 12.8% and 14.8% over previously reported results on the Recall@20 and Mean Reciprocal Rank@20 metrics respectively.
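A common form of session data augmentation for RNN recommenders treats every prefix of a session as an extra training sequence with the next click as its target; the sketch below shows this preprocessing (assumed here for illustration, not necessarily the authors' exact pipeline).

```python
# Session data augmentation sketch: every prefix of a session becomes an extra
# training sequence, with the next click as its target.
def augment_sessions(sessions):
    """sessions: list of click sequences, e.g. [[item1, item2, item3], ...].
    Returns (input_prefix, target_item) training pairs."""
    pairs = []
    for session in sessions:
        for t in range(1, len(session)):
            pairs.append((session[:t], session[t]))
    return pairs

print(augment_sessions([["a", "b", "c", "d"]]))
# [(['a'], 'b'), (['a', 'b'], 'c'), (['a', 'b', 'c'], 'd')]
```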
As the number of small, battery-operated, wireless-enabled devices deployed in various applications of the Internet of Things (IoT), Wireless Sensor Networks (WSN), and Cyber-Physical Systems (CPS) rapidly increases, so does the number of data streams that must be processed. In cases where data do not need to be archived, centrally processed, or federated, in-network data processing is becoming more common. For this purpose, various platforms like DRAGON, Innet, and CJF have been proposed. However, these platforms assume that all nodes in the network are the same, i.e. the network is homogeneous. As Moore's law still applies, nodes are becoming smaller, more powerful, and more energy efficient each year, a trend that will continue for the foreseeable future. Therefore, we can expect that as sensor networks are extended and updated, hardware heterogeneity will soon be common in networks - the same trend as can be seen in cloud computing infrastructures. This heterogeneity introduces new challenges in choosing an in-network data processing node, as not only its location but also its capabilities must be considered. This paper introduces a new methodology to tackle this challenge, comprising three new algorithms - Request, Traverse, and Mixed - for efficiently locating an in-network data processing node while taking into account not only position within the network but also hardware capabilities. The proposed algorithms are evaluated against a naïve approach and achieve up to 90% reduction in network traffic during long-term data processing, while spending a similar amount of time in the discovery phase.
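For intuition, a naïve centralized sketch of capability-aware node selection is shown below; the paper's Request, Traverse, and Mixed algorithms are distributed and avoid this kind of global search, but the trade-off between hop distance and hardware capability is the same.

```python
# Naive sketch of capability-aware processing-node selection: BFS from the data
# source, then trade hop distance against each candidate's capability score.
from collections import deque

def pick_processing_node(adj, source, capability, alpha=1.0):
    """adj: dict node -> list of neighbours; capability: dict node -> score."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    # higher capability is better, more hops is worse
    return max(dist, key=lambda n: capability.get(n, 0) - alpha * dist[n])

adj = {"s": ["a"], "a": ["s", "b"], "b": ["a"]}
print(pick_processing_node(adj, "s", {"s": 1, "a": 2, "b": 6}))   # -> 'b'
```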
In recent times, we have seen a proliferation of personal data. This can be attributed not just to a larger proportion of our lives moving online, but also to the rise of ubiquitous sensing through mobile and IoT devices. Alongside this surge, concerns over privacy, trust, and security are voiced more and more as different parties attempt to take advantage of this rich assortment of data. The Databox seeks to enable all the advantages of personal data analytics while at the same time enforcing **accountability** and **control** in order to protect a user's privacy. In this work, we propose and delineate a personal networked device that allows users to **collate**, **curate**, and **mediate** their personal data.
One essential functionality of a modern operating system is to accurately account for the resource usage of the underlying hardware. This is especially important for computing systems that operate on battery power, since energy management requires accurately attributing resource use to processes. However, components such as sensors, actuators and specialized network interfaces are often used in an asynchronous fashion, which makes it difficult to conduct accurate resource accounting. For example, a process that makes a request to a sensor may not be running on the processor for the full duration of the resource usage, and current mechanisms of resource accounting fail to provide accurate accounting for such asynchronous uses. This paper proposes a new mechanism to accurately account for the asynchronous usage of resources in mobile systems. Our insight is that by accurately relating user requests to the corresponding kernel requests to a device and the device's responses, we can attribute resource use to the requesting process. Our prototype implemented in Linux demonstrates that we can accurately account for the usage of asynchronous resources such as GPS and WiFi.
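A toy illustration of the accounting insight: pair each asynchronous device request with its response and charge the in-between busy time to the requesting process, even when that process is no longer running on the CPU. The event format and numbers below are hypothetical.

```python
# Toy sketch: attribute device-busy time to the requesting process by pairing
# asynchronous requests with their responses (event format is hypothetical).
from collections import defaultdict

events = [  # (timestamp_ms, kind, request_id, pid)
    (0,   "request",  "gps-1",  101),
    (5,   "request",  "wifi-7", 202),
    (120, "response", "gps-1",  101),
    (30,  "response", "wifi-7", 202),
]

def attribute_usage(events):
    start, usage = {}, defaultdict(int)
    for ts, kind, req, pid in sorted(events):
        if kind == "request":
            start[req] = ts
        else:                               # response closes the busy interval
            usage[pid] += ts - start.pop(req)
    return dict(usage)

print(attribute_usage(events))             # {202: 25, 101: 120} (milliseconds)
```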
The huge popularity of online social networks and the potential financial gain have led to the creation and proliferation of zombie accounts, i.e., fake user accounts. For a considerable payment, zombie accounts can be directed by their managers to provide pre-arranged, biased reactions to different social events or to the quality of a commercial product. It is thus critical to detect and screen these accounts. Prior approaches are either inaccurate or rely heavily on complex posting/tweeting behaviors when classifying normal/zombie accounts. In this work, we propose to use a bi-level penalized logistic classifier, an efficient high-dimensional data analysis technique, to detect zombie accounts based on their publicly available profile information and the statistics of their followers' registration locations. Our approach, termed (B)i-level (P)enalized (LO)gistic (C)lassifier (BPLOC), is data adaptive and can be extended to mount more accurate detection. Our experimental results are based on a small number of SINA WeiBo accounts and demonstrate that BPLOC can classify zombie accounts accurately.
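As a simplified stand-in for BPLOC (a plain L1-penalized logistic regression rather than the bi-level penalty of the paper), the following sketch shows how sparse penalization selects a subset of account features; the data is synthetic and the feature semantics are hypothetical.

```python
# Simplified stand-in for BPLOC: an L1-penalised logistic classifier over
# synthetic account features (labels: 1 = zombie, 0 = normal).
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 50))                 # hypothetical profile/follower features
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=200) > 0).astype(int)

clf = LogisticRegression(penalty="l1", solver="liblinear", C=0.5)
clf.fit(X, y)
print("training accuracy:", clf.score(X, y))
print("features kept:", int((clf.coef_ != 0).sum()), "of", X.shape[1])
```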
Social engineering is a kind of advanced persistent threat (APT) that gains private and sensitive information through social networks or other types of communication. Attackers can use social engineering to obtain access to social network accounts and stay there undetected for a long period of time. The purpose of the attack is to steal sensitive data and spread false information rather than to cause direct damage. Targets can include Facebook accounts of government agencies, corporations, schools or high-profile users. We propose to use an intrusion detection system (IDS) to battle such attacks. Social engineering tries to gain easy access, so that the attacks can be repeated and ongoing. The focus of this study is to find out how this type of attack is carried out so that it can be properly detected by an IDS in future research.
SMS (Short Messaging Service) is a text messaging service for mobile users to exchange short text messages. It is also widely used to provide SMS-powered services (e.g., mobile banking). With the rapid deployment of all-IP 4G mobile networks, the underlying technology of SMS has evolved from the legacy circuit-switched network to the IMS (IP Multimedia Subsystem) system over a packet-switched network. In this work, we study the insecurity of IMS-based SMS. We uncover its security vulnerabilities and exploit them to devise four SMS attacks: silent SMS abuse, SMS spoofing, SMS client DoS, and SMS spamming. We further discover that these SMS threats can propagate towards SMS-powered services, thereby leading to three malicious attacks: social network account hijacking, unauthorized donation, and unauthorized subscription. Our analysis reveals that the problems stem from the loose security regulations among mobile phones, carrier networks, and SMS-powered services. We finally propose remedies to the identified security issues.
The number of wearable and smart devices connecting every day to the Internet of Things (IoT) is continuously growing. This presents a great opportunity to improve quality of life (QoL) standards by adding medical value to these devices. In particular, by exploiting IoT technology, we have the potential to create useful tools which utilize the sensors to provide biometric data. This novel study aims to use a smartwatch, independent of other hardware, to predict the Blood Pressure (BP) drop caused by postural changes. When the drop is due to orthostatic hypotension (OH), it can cause dizziness or even fainting, which increases the risk of falls in the elderly as well as in younger groups of people. A mathematical prediction model is proposed here which can reduce the risk of falls due to OH by sensing heart rate variability data and drops in systolic BP after standing in a healthy group of 10 subjects. The experimental results justify the efficiency of the model, as it performs correct prediction in 86.7% of the cases, and are encouraging enough to extend the proposed approach, with large-scale experiments, to pathological cases such as patients with Parkinson's disease.
The Internet of Things (IoT) has been connecting the physical world seamlessly and provides tremendous opportunities to a wide range of applications. However, potential risks exist when an IoT system collects local sensor data and uploads it to the Cloud. Private data leakage can be severe with a curious database administrator or malicious hackers who compromise the Cloud. In this demo, we address the problem of guaranteeing user data privacy and security using a compressive sensing based cryptographic method. We present CScrypt, a compressive-sensing-based encryption engine for Cloud-enabled IoT systems to secure the interaction between IoT devices and the Cloud. Our system exploits the fact that each individual's biometric data can be trained into a unique dictionary, which can be used as an encryption key while simultaneously compressing the original data. We will demonstrate a functioning prototype of our system using a live data stream at the conference.
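The compressive-sensing-as-encryption idea can be sketched as follows: a sensing matrix derived from a secret both compresses a sparse signal and is required for recovery. This toy example uses a random key-seeded matrix rather than CScrypt's learned per-user dictionaries.

```python
# Toy sketch of compressive sensing as encryption: a key-derived sensing matrix
# compresses a sparse signal; recovering it requires regenerating the matrix.
import numpy as np
from sklearn.linear_model import OrthogonalMatchingPursuit

n, m, k = 128, 48, 5                       # signal length, measurements, sparsity

signal_rng = np.random.default_rng(7)      # stands in for real sensor data
x = np.zeros(n)
x[signal_rng.choice(n, k, replace=False)] = signal_rng.normal(size=k)

key_rng = np.random.default_rng(1234)      # 1234 plays the role of the secret key
Phi = key_rng.normal(size=(m, n)) / np.sqrt(m)   # key-derived sensing matrix
y = Phi @ x                                      # compressed "ciphertext" sent to the Cloud

# Only a party that can regenerate Phi from the key can run sparse recovery.
omp = OrthogonalMatchingPursuit(n_nonzero_coefs=k, fit_intercept=False)
omp.fit(Phi, y)
print("recovery error:", np.linalg.norm(omp.coef_ - x))
```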
Traditional authentication techniques such as static passwords are vulnerable to replay and guessing attacks. Recently, many studies have been conducted on keystroke dynamics as a promising behavioral biometric for strengthening user authentication. However, current keystroke-based solutions suffer from a large number of features combined with an insufficient number of samples, which leads to a high verification error rate and long verification time. In this paper, a keystroke dynamics based authentication system is proposed for cloud environments that supports fixed and free text samples. The proposed system utilizes the ReliefF dimensionality reduction method, as a preprocessing step, to minimize the feature space dimensionality, and applies clustering to users' profile templates to reduce the verification time. The system is evaluated on two different benchmark datasets. Experimental results demonstrate the effectiveness and efficiency of the proposed system.
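A compact sketch of the described preprocessing is given below: Relief-style feature weighting to shrink the keystroke feature space, followed by clustering of profile templates. It is illustrative only; the data is synthetic and the parameters (number of kept features, number of clusters) are arbitrary.

```python
# Sketch of the preprocessing: Relief-style feature weighting, then clustering
# of profile templates (synthetic data, arbitrary parameters).
import numpy as np
from sklearn.cluster import KMeans

def relief_weights(X, y):
    """Basic Relief: reward features that separate each sample from its
    nearest miss and penalise those that differ from its nearest hit."""
    n, d = X.shape
    w = np.zeros(d)
    for i in range(n):
        diff = np.abs(X - X[i])
        dist = diff.sum(axis=1)
        dist[i] = np.inf
        same, other = (y == y[i]), (y != y[i])
        hit = np.argmin(np.where(same, dist, np.inf))
        miss = np.argmin(np.where(other, dist, np.inf))
        w += diff[miss] - diff[hit]
    return w / n

rng = np.random.default_rng(0)
X = rng.normal(size=(60, 40))                  # hypothetical keystroke features
y = np.repeat([0, 1], 30)                      # two users
top = np.argsort(-relief_weights(X, y))[:10]   # keep the 10 strongest features
templates = KMeans(n_clusters=4, n_init=10).fit(X[:, top])  # cluster profile templates
print(top, templates.cluster_centers_.shape)
```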
Mobile devices store a diverse set of private user data and have gradually become a hub to control users' other personal Internet-of-Things devices. Access control on mobile devices is therefore highly important. The widely accepted solution is to protect access by asking for a password. However, password authentication is tedious, e.g., a user needs to input a password every time she wants to use the device. Moreover, existing biometrics such as face, fingerprint, and touch behaviors are vulnerable to forgery attacks. We propose a new touch-based biometric authentication system that is passive and secure against forgery attacks. In our touch-based authentication, a user's touch behaviors are a function of some random "secret". The user can subconsciously know the secret while touching the device's screen. However, an attacker cannot know the secret at the time of attack, which makes it challenging to perform forgery attacks even if the attacker has already obtained the user's touch behaviors. We evaluate our touch-based authentication system by collecting data from 25 subjects. Results are promising: the random secrets do not influence user experience and, for targeted forgery attacks, our system achieves Equal Error Rates (EERs) that are 0.18 lower than those of previous touch-based authentication schemes.
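For reference, an Equal Error Rate of the kind reported above can be computed from genuine and impostor match scores as in the generic sketch below (not the paper's scoring function; the score distributions are synthetic).

```python
# Generic EER computation from genuine and impostor match scores.
import numpy as np

def equal_error_rate(genuine, impostor):
    thresholds = np.sort(np.concatenate([genuine, impostor]))
    far = np.array([(impostor >= t).mean() for t in thresholds])  # false accept rate
    frr = np.array([(genuine < t).mean() for t in thresholds])    # false reject rate
    idx = np.argmin(np.abs(far - frr))
    return (far[idx] + frr[idx]) / 2

rng = np.random.default_rng(0)
genuine = rng.normal(1.0, 0.5, 300)    # hypothetical same-user scores
impostor = rng.normal(0.0, 0.5, 300)   # hypothetical forgery scores
print("EER:", round(equal_error_rate(genuine, impostor), 3))
```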
Defect-prediction techniques can enhance the quality assurance activities for software systems. For instance, they can be used to predict bugs in source files or functions. In the context of a software product line, such techniques could ideally be used for predicting defects in features or combinations of features, which would allow developers to focus quality assurance on the error-prone ones. In this preliminary case study, we investigate how defect prediction models can be used to identify defective features using machine-learning techniques. We adapt process metrics and evaluate and compare three classifiers using an open-source product line. Our results show that the technique can be effective. Our best scenario achieves an accuracy of 73% in predicting features as defective or clean using a Naive Bayes classifier. Based on these results, we discuss directions for future work.
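A minimal version of the described workflow, assuming hypothetical process metrics and synthetic labels, could look like the following sketch using a Gaussian Naive Bayes classifier with cross-validation.

```python
# Minimal defect-prediction sketch: Naive Bayes over per-feature process
# metrics (metric names and data are hypothetical placeholders).
import numpy as np
from sklearn.naive_bayes import GaussianNB
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(42)
# columns: e.g. number of changes, number of distinct authors, lines added
X = rng.poisson(lam=[5, 2, 40], size=(120, 3)).astype(float)
y = (X[:, 0] + 0.1 * X[:, 2] + rng.normal(size=120) > 8).astype(int)  # 1 = defective

scores = cross_val_score(GaussianNB(), X, y, cv=5, scoring="accuracy")
print("mean accuracy:", scores.mean().round(2))
```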
We have developed the Cross Domain Desktop Compositor, a hardware-based multi-level secure user interface, suitable for deployment in high-assurance environments. Through composition of digital display data from multiple physically-isolated single-level secure domains, and judicious switching of keyboard and mouse input, we provide an integrated multi-domain desktop solution. The system developed enforces a strict information flow policy and requires no trusted software. To fulfil high-assurance requirements and achieve a low cost of accreditation, the architecture favours simplicity, using mainly commercial-off-the-shelf components complemented by small trustworthy hardware elements. The resulting user interface is intuitive and responsive and we show how it can be further leveraged to create integrated multi-level applications and support managed information flows for secure cross domain solutions. This is a new approach to the construction of multi-level secure user interfaces and multi-level applications which minimises the required trusted computing base, whilst maintaining much of the desired functionality.
The infrastructures of Supervisory Control and Data Acquisition (SCADA) systems have evolved over time in order to provide more efficient supervision services. Despite the changes made to SCADA architectures, several enhancements are still required to address the need for: a) large-scale supervision using a high number of sensors; b) reduction of the reaction time when a malicious activity is detected; and c) the assurance of high interoperability between SCADA systems in order to prevent the propagation of incidents. In this context, we propose a novel sensor cloud based SCADA infrastructure to monitor large-scale and interdependent critical infrastructures, making effective use of sensor clouds to increase the supervision coverage and improve the processing time. It also ensures interoperability between interdependent SCADAs by offering a set of services to SCADA, which are created through the use of templates and are associated with sets of virtual sensors. A simulation is conducted to demonstrate the effectiveness of the proposed architecture.