Biblio
State-of-the-art Android malware often encrypts or encodes malicious code snippets to evade detection. In this paper, such undetectable code snippets are called Mysterious Codes. To make such codes detectable, we design a system called Droidrevealer to automatically identify Mysterious Codes and then decode or decrypt them. A prototype of Droidrevealer is implemented and evaluated with 5,600 malware samples. The results show that 257 samples contain Mysterious Codes and that 11,367 items are exposed. Furthermore, several sensitive behaviors hidden in the Mysterious Codes are disclosed by Droidrevealer.
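As a rough illustration of the kind of exposure step this entry describes, the Python sketch below scans strings extracted from an app for Base64-looking payloads and decodes the ones that yield printable text. The pattern, length threshold, and example payload are invented for illustration and are not Droidrevealer's actual logic.

```python
import base64
import re

# Hypothetical illustration (not Droidrevealer's code): scan strings extracted
# from an APK for Base64-looking payloads and try to decode them, which is the
# kind of "Mysterious Code" exposure the abstract describes.
B64_PATTERN = re.compile(rb"[A-Za-z0-9+/]{24,}={0,2}")

def expose_encoded_strings(raw_bytes):
    """Return decoded candidates found in a blob of extracted strings."""
    exposed = []
    for match in B64_PATTERN.finditer(raw_bytes):
        try:
            decoded = base64.b64decode(match.group(), validate=True)
        except ValueError:
            continue
        # Keep only candidates that decode to printable text (likely real payloads).
        if decoded and all(32 <= b < 127 for b in decoded):
            exposed.append(decoded.decode("ascii"))
    return exposed

if __name__ == "__main__":
    blob = b'sendTextMessage("aHR0cDovL2V2aWwuZXhhbXBsZS5jb20vY21k")'
    print(expose_encoded_strings(blob))  # ['http://evil.example.com/cmd']
```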
As Android application packing technology evolves, there are more and more ways to harden apps, and manually unpacking them becomes increasingly difficult as the time needed for analysis grows exponentially. Originally, packing technology was designed to prevent apps from being easily decompiled, tampered with, and repacked. Unfortunately, many malicious apps have started to use packing services to protect themselves. At present, most antivirus software focuses on unpacked apps, which means that malicious apps using a packing service can easily evade much of it. Therefore, we should not only emphasize the importance of packing but also concentrate on unpacking technology; only then can we protect benign apps without missing harmful ones. In this paper, we first systematically study a number of DEX packing and unpacking technologies, then propose and develop a universal unpacking system, named CrackDex, which is capable of extracting the original DEX file from a packed app. We propose three core techniques, simulation execution, DEX reassembling, and DEX restoration, to obtain the unpacked DEX file. CrackDex is part of the Dalvik virtual machine; it monitors the execution of functions to locate the unpacking point in the portable interpreter, launches the simulation execution, collects the data of the original DEX file through the corresponding structure pointers, and finally completes the unpacking process by reassembling the collected data. Our experimental results show that CrackDex can effectively unpack apps packed by packing services in a universal way, without any prior knowledge of the packing service.
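The abstract names the DEX reassembling step but gives no code. The sketch below shows a much-simplified analogue in Python: locate DEX headers in a memory dump by their magic bytes and carve out file_size bytes from each header. CrackDex itself operates inside the Dalvik portable interpreter and walks structure pointers, so this is only a hedged approximation of the data-collection idea.

```python
import struct

# Simplified, hypothetical sketch of the "DEX reassembling" idea: given a
# memory dump taken at the unpacking point, find DEX headers by their magic
# bytes and carve file_size bytes (the header stores file_size at offset 0x20).
DEX_MAGIC = b"dex\n035\x00"

def carve_dex(dump: bytes):
    carved = []
    start = dump.find(DEX_MAGIC)
    while start != -1:
        # file_size is a little-endian uint32 at offset 0x20 of the DEX header.
        (file_size,) = struct.unpack_from("<I", dump, start + 0x20)
        if 0x70 <= file_size <= len(dump) - start:  # header itself is 0x70 bytes
            carved.append(dump[start:start + file_size])
        start = dump.find(DEX_MAGIC, start + 1)
    return carved

# Usage sketch:
# for i, dex in enumerate(carve_dex(open("heap.dump", "rb").read())):
#     open(f"unpacked_{i}.dex", "wb").write(dex)
```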
Outsourcing services to third-party providers comes with a high security cost: the providers must be fully trusted. Using trusted hardware can help, but current trusted execution environments do not adequately support services that process very large-scale datasets. We present LASTGT, a system that bridges this gap by supporting the execution of self-contained services over a large state, with a small and generic trusted computing base (TCB). LASTGT uses widely deployed trusted hardware to guarantee integrity and verifiability of the execution on a remote platform, and it securely supplies data to the service through simple techniques based on virtual memory. As a result, LASTGT is general and applicable to many scenarios such as computational genomics and databases, as we show in our experimental evaluation based on an implementation of LASTGT on a secure hypervisor. We also describe a possible implementation on Intel SGX.
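LASTGT's mechanism for supplying a large state through virtual memory is only summarized above. As a purely conceptual sketch (not LASTGT's implementation, which relies on trusted hardware and a hypervisor), the Python below memory-maps a large state file and verifies each chunk against a trusted hash manifest the first time it is touched; the chunk size and manifest format are invented.

```python
import hashlib
import mmap

# Conceptual sketch of lazily supplying a large state with integrity checks:
# the state file is memory-mapped, and each chunk is hashed and compared
# against a trusted manifest on first access. Granularity and layout are
# placeholders, not LASTGT's actual design.
CHUNK = 1 << 20  # 1 MiB chunks (the manifest must be built with the same size)

class VerifiedState:
    def __init__(self, path, manifest):
        self._file = open(path, "rb")
        self._map = mmap.mmap(self._file.fileno(), 0, access=mmap.ACCESS_READ)
        self._manifest = manifest          # chunk index -> expected SHA-256 hex digest
        self._verified = set()

    def read(self, offset, length):
        first, last = offset // CHUNK, (offset + length - 1) // CHUNK
        for idx in range(first, last + 1):
            if idx not in self._verified:
                chunk = self._map[idx * CHUNK:(idx + 1) * CHUNK]
                if hashlib.sha256(chunk).hexdigest() != self._manifest[idx]:
                    raise ValueError(f"integrity check failed for chunk {idx}")
                self._verified.add(idx)
        return self._map[offset:offset + length]
```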
Tensor decompositions, which are factorizations of multi-dimensional arrays, are becoming increasingly important in large-scale data analytics. A popular tensor decomposition algorithm is Canonical Decomposition/Parallel Factorization using alternating least squares fitting (CP-ALS). Tensors that model real-world applications are often very large and sparse, driving the need for high performance implementations of decomposition algorithms, such as CP-ALS, that can take advantage of many types of compute resources. In this work we present ReFacTo, a heterogeneous distributed tensor decomposition implementation based on DFacTo, an existing distributed-memory approach to CP-ALS. DFacTo reduces the critical routine of CP-ALS to a series of sparse matrix-vector multiplications (SpMVs). ReFacTo leverages GPUs within a cluster via MPI to perform these SpMVs and uses OpenMP threads to parallelize other routines. We evaluate the performance of ReFacTo when using NVIDIA's GPU-based cuSPARSE library and compare it to an alternative implementation that uses Intel's CPU-based Math Kernel Library (MKL) for the SpMV. Furthermore, we provide a discussion of the performance challenges of heterogeneous distributed tensor decompositions based on the results we observed. We find that on up to 32 nodes, the SpMV of ReFacTo when using MKL is up to 6.8× faster than ReFacTo when using cuSPARSE.
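Since the critical routine is reduced to SpMVs, the kernel being offloaded to cuSPARSE or MKL is easy to show in miniature. The Python sketch below just builds a random CSR matrix and performs one SpMV with SciPy; the shapes and density are arbitrary and it says nothing about ReFacTo's MPI/GPU orchestration.

```python
import numpy as np
from scipy.sparse import random as sparse_random

# Toy illustration of the core kernel: DFacTo-style CP-ALS reduces its
# expensive step to repeated sparse matrix-vector products, so the per-node
# work is dominated by SpMVs like the one below (shapes/density arbitrary).
X = sparse_random(100_000, 5_000, density=1e-4, format="csr", dtype=np.float64)
v = np.random.rand(5_000)

y = X @ v  # one SpMV; many such products are performed per CP-ALS iteration
print(y.shape)  # (100000,)
```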
Security protection is a concern for the Internet of Things (IoT), which performs data exchange autonomously over the internet for remote monitoring, automation, and other applications. IoT implementations have raised concerns over their security, and various research efforts have sought an effective solution. This work therefore focuses on the analysis of an asymmetric encryption scheme, AA-Beta (AAβ), on a platform constrained in terms of processor capability, storage, and random access memory (RAM): the ARM Cortex-M7 microcontroller. The encryption and decryption performance on the embedded microcontroller is evaluated by measuring execution time. With the I-Cache (instruction cache) and D-Cache (data cache) enabled, performance is 50% faster than with both caches disabled. The performance is then compared to our previous work on a System on Chip (SoC). This comparison analyzes the gap between the SoC, which utilizes the full GNU Multiple Precision Arithmetic Library (GMP) package, and the ARM Cortex-M7, which uses the mini-gmp package, in terms of footprint and actual performance.
We present work undertaken at our institutional repository to enhance metadata and re-organize digital objects according to a new information architecture, in an effort to minimize administrative object management and processing and to improve object discovery and use. This work was partly motivated by the launch of a new discovery platform at our institution, which aggregates metadata and full text from our four open access repositories into a cohesive, consistent, and enhanced searching and browsing experience. The platform provides digital object identifier (DOI) assignment, metadata access via various formats, and an open metadata and full-text application programming interface (API) for researchers, amongst other features. The functionality of these platform features relies heavily on accurate object representation and metadata. This work facilitates and improves the discovery of, and engagement with, the diverse digital objects available from our institution, so they can be used and analyzed in new, flexible, and innovative ways by a myriad of communities and disciplines.
In the presence of known and unknown vulnerabilities in program code and control flow, virtual-machine-like isolation and sandboxing, which confine a malicious process by monitoring and controlling the behaviour of the untrusted application, are an effective strategy. A confined malicious application cannot affect system resources or other applications running on the same operating system. However, present sandboxing techniques have drawbacks ranging from scope to methodology. Some proposed techniques restrict only specific aspects of execution, e.g., system calls and file system access. Likewise, techniques that truly isolate an application by providing a separate execution environment require either kernel modifications or a full-blown operating system; moreover, they do not provide top-to-bottom isolation but only virtualize operating system services. In this paper, we propose a design that confines a native Linux process with virtual-machine-equivalent isolation by using hardware virtualization extensions, with nominal initialization and acceptable execution overheads. We implemented a prototype called Process Virtual Machine that transitions a native process into a virtual machine, provides the minimal possible execution environment, and intercepts and virtualizes system calls to execute them on the host kernel. Experimental results show the effectiveness of the proposed technique.
Intellectual Property (IP) verification is a crucial component of System-on-Chip (SoC) design in the modern IC design business model. Given a globalized supply chain and an increasing demand for IP reuse, IP theft has become a major concern for the IC industry. In this paper, we address the trust issues that arise between IP owners and IP users during the functional verification of an IP core. Our proposed scheme ensures the privacy of IP owners and users by a) generating a privacy-preserving version of the IP, which is functionally equivalent to the original design, and b) employing homomorphically encrypted input vectors. This allows the functional verification to be securely outsourced to a third party, or to be executed by either party, while revealing the least possible information regarding the test vectors and the IP core. Experiments on both combinational and sequential benchmark circuits demonstrate an IP verification slowdown of up to three orders of magnitude, due to the computationally intensive fully homomorphic operations, for different security parameter sizes.
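For readers unfamiliar with functional verification of an IP core, the toy Python sketch below evaluates a gate-level netlist on test vectors and compares the responses against golden outputs. It deliberately omits both layers of the scheme described above, the privacy-preserving IP transformation and the homomorphic encryption of the input vectors, so everything here is plaintext and purely illustrative; the half-adder netlist is an invented example.

```python
# Plaintext sketch of the functional-verification step only (the paper's scheme
# runs this on a privacy-preserving netlist with homomorphically encrypted
# test vectors, which is not reproduced here).
def evaluate(netlist, inputs):
    """netlist: list of (output_net, op, input_nets); inputs: dict net -> bit."""
    values = dict(inputs)
    ops = {"AND": lambda a, b: a & b, "OR": lambda a, b: a | b,
           "XOR": lambda a, b: a ^ b, "NOT": lambda a: 1 - a}
    for out, op, ins in netlist:
        values[out] = ops[op](*[values[n] for n in ins])
    return values

# A half adder as the hypothetical IP under verification.
netlist = [("sum", "XOR", ["a", "b"]), ("carry", "AND", ["a", "b"])]
for a in (0, 1):
    for b in (0, 1):
        v = evaluate(netlist, {"a": a, "b": b})
        assert (v["sum"], v["carry"]) == (a ^ b, a & b)
print("all test vectors match the golden responses")
```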
One of the main issues in the design of modern integrated circuits is power reduction. In digital circuits, power consumption was for decades dominated by dynamic power. In new nanoCMOS technologies, however, the static power due to leakage current is becoming the main contributor. As leakage power is related to the number of components, it is becoming mandatory to reduce the number of transistors in any type of design in order to reduce power consumption. It is therefore important to develop new EDA algorithms and tools that optimize the number of components (transistors). Tools are also needed for layout design automation that can lay out any network of components produced by an optimization tool that reduces the size of that network. An example is presented of a layout design automation tool that can generate the layout of any network of transistors using transistors of any size. Another issue for power optimization is the use of tools and algorithms for gate sizing: the designer can manage transistor sizing to reduce power consumption without compromising the clock frequency. There are two types of gate sizing, discrete and continuous. Discrete gate sizing tools are used with a cell library that offers only a few sizes for each cell, whereas continuous gate sizing assumes that the EDA tool can choose any transistor size; in the latter case, the designer needs a layout design tool able to lay out transistors of any size. The winning tools of the ISPD 2012 and 2013 contests will be presented, and the inclusion of our gate sizing algorithms in an industrial flow used to design state-of-the-art microprocessors will be discussed. Another type of EDA tool that is becoming more and more useful is visualization tools that provide an animated visual output of a running EDA tool. Such tools are very useful for showing tool developers how the tool is running, and EDA developers can use this information to improve the algorithms used in an EDA tool.
We propose a methodology for architecture exploration for Cyber-Physical Systems (CPS) based on an iterative, optimization-based approach, where a discrete architecture selection engine is placed in a loop with a continuous sizing engine. The discrete optimization routine proposes a candidate architecture to the sizing engine. The sizing routine optimizes over the continuous parameters using simulation to evaluate the physical models and to monitor the requirements. To decrease the number of simulations, we show how balance equations and conservation laws can be leveraged to prune the discrete space, thus achieving significant reduction in the overall runtime. We demonstrate the effectiveness of our methodology on an industrial case study, namely an aircraft environmental control system, showing more than one order of magnitude reduction in optimization time.
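The structure of the methodology, a discrete selection engine wrapped around a continuous sizing engine with conservation-law pruning, can be sketched compactly. The Python below is a schematic illustration only: the cost model, the balance check, and the architecture encoding are invented placeholders, not the paper's aircraft environmental control system models.

```python
from itertools import product
from scipy.optimize import minimize

# Schematic sketch of the iterative loop: the discrete engine proposes a
# candidate architecture, the continuous engine sizes it, and candidates that
# violate a simple balance (conservation) bound are pruned before any
# expensive evaluation. Cost model and pruning rule are placeholders.
def sizing_cost(x, arch):
    # Placeholder for a simulation-based evaluation of the physical model.
    n_units = sum(arch)
    return n_units * (x[0] - 1.0) ** 2 + (x[1] - n_units) ** 2

def violates_balance(arch, demand=3):
    # Prune architectures whose total capacity cannot meet the demand.
    return sum(arch) < demand

best = None
for arch in product([0, 1, 2], repeat=3):                    # discrete space
    if violates_balance(arch):
        continue                                             # pruned, no simulation
    res = minimize(sizing_cost, x0=[0.0, 0.0], args=(arch,)) # continuous sizing
    if best is None or res.fun < best[2]:
        best = (arch, res.x, res.fun)

print("best architecture:", best[0], "sized parameters:", best[1])
```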
Online fault diagnosis has been a crucial task for industrial processes. Reconstruction-based fault diagnosis has been drawing special attention as a good alternative to the traditional contribution plot. It identifies the fault cause by finding the specific fault subspace that can well eliminate alarming signals from a set of alternatives prepared from historical fault data. However, in practice, the abnormality may result from the joint effects of multiple faults, which thus cannot be well corrected by a single fault subspace archived in the historical fault library. In the present work, an aggregative reconstruction-based fault diagnosis strategy is proposed to handle the case where multiple fault causes jointly contribute to the abnormal process behaviors. First, fault subspaces are extracted from historical fault data in two different monitoring subspaces, where analysis of relative changes is used to enclose the major fault effects responsible for the different alarming monitoring statistics. Then, a fault subspace selection strategy is developed to analyze the combinatorial fault nature, sorting and selecting the informative fault subspaces that are most likely responsible for the concerned abnormalities. Finally, an aggregative fault subspace is calculated by combining the selected fault subspaces; it represents the joint effects of multiple faults and works as the final reconstruction model for online fault diagnosis. Theoretical support is framed and the related statistical characteristics are analyzed. The feasibility and performance of the method are illustrated with simulated multiple faults using data from the Tennessee Eastman (TE) benchmark process.
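To make the reconstruction idea concrete, the sketch below shows the standard single-statistic form of reconstruction-based diagnosis in Python/NumPy: for each candidate fault subspace, estimate the fault magnitude that best removes the alarm and rank subspaces by the remaining monitoring index. The paper's method goes further (two monitoring subspaces, relative-change analysis, and an aggregative subspace built from several selected subspaces), so this is only the basic building block, with invented variable names.

```python
import numpy as np

# M defines the monitoring statistic (e.g., derived from a PCA residual space);
# each Xi is a candidate fault subspace extracted from historical fault data.
def reconstructed_index(x, Xi, M):
    """Monitoring statistic after removing the fault effect along subspace Xi."""
    f = np.linalg.pinv(Xi.T @ M @ Xi) @ (Xi.T @ M @ x)   # optimal fault magnitude
    x_rec = x - Xi @ f                                    # reconstructed sample
    return float(x_rec.T @ M @ x_rec)

def diagnose(x, subspaces, M):
    """Return candidate subspaces sorted by how well they explain the alarm."""
    scores = {name: reconstructed_index(x, Xi, M) for name, Xi in subspaces.items()}
    return sorted(scores.items(), key=lambda kv: kv[1])

# In the aggregative strategy, the best few subspaces would be combined
# column-wise into one subspace used for the final reconstruction.
```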
In many client-facing applications, a vulnerability in any part can compromise the entire application. This paper describes the design and implementation of Passe, a system that protects a data store from unintended data leaks and unauthorized writes even in the face of application compromise. Passe automatically splits (previously shared-memory-space) applications into sandboxed processes. Passe limits communication between those components and the types of accesses each component can make to shared storage, such as a backend database. In order to limit components to their least privilege, Passe uses dynamic analysis on developer-supplied end-to-end test cases to learn data- and control-flow relationships between database queries and previous query results, and it then strongly enforces those relationships. Our prototype of Passe acts as a drop-in replacement for the Django web framework. By running eleven unmodified, off-the-shelf applications in Passe, we demonstrate its ability to provide strong security guarantees (Passe correctly enforced 96% of the applications' policies) with little additional overhead. Additionally, in the web-specific setting of the prototype, we also mitigate the cross-component effects of cross-site scripting (XSS) attacks by combining browser HTML5 sandboxing techniques with our automatic component separation.
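The following Python snippet is a toy analogue, not Passe's actual mechanism, of what enforcing a learned data-flow relationship looks like: suppose dynamic analysis observed that the user id passed to a "fetch messages" query always equals the id returned by the earlier authentication query; an enforcement layer then rejects any request where that relationship does not hold. The table names and the `db` connection (assumed DB-API/sqlite3-style) are invented.

```python
# Toy analogue of enforcing a learned data-flow relationship between a query
# parameter and a previous query result (not Passe's implementation).
class QueryProxy:
    def __init__(self, db):
        self.db = db              # a DB-API connection, e.g. sqlite3.connect(...)
        self.prior_results = {}

    def authenticate(self, username, password):
        row = self.db.execute(
            "SELECT id FROM users WHERE name = ? AND pw = ?", (username, password)
        ).fetchone()
        self.prior_results["auth_uid"] = row[0] if row else None
        return row

    def fetch_messages(self, uid):
        # Invariant learned from developer-supplied end-to-end test cases.
        if uid != self.prior_results.get("auth_uid"):
            raise PermissionError("query violates learned data-flow relationship")
        return self.db.execute(
            "SELECT body FROM messages WHERE owner = ?", (uid,)
        ).fetchall()
```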
Optimizing memory access is critical for performance and power efficiency. CPU manufacturers have developed sampling-based performance monitoring units (PMUs) that report precise costs of memory accesses at specific addresses. However, this data is too low-level to be meaningfully interpreted and contains an excessive amount of irrelevant or uninteresting information. We have developed a method to gather fine-grained memory access performance data for specific data objects and regions of code with low overhead and attribute semantic information to the sampled memory accesses. This information provides the context necessary to more effectively interpret the data. We have developed a tool that performs this sampling and attribution and used the tool to discover and diagnose performance problems in real-world applications. Our techniques provide useful insight into the memory behaviour of applications and allow programmers to understand the performance ramifications of key design decisions: domain decomposition, multi-threading, and data motion within distributed memory systems.
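The attribution step described above can be illustrated with a small Python sketch: map raw sampled addresses (as a PMU would report them) to the data objects whose ranges contain them, so access costs can be aggregated per object rather than per address. The object table and sample values here are hypothetical; a real tool would build the table from allocation tracking or debug information.

```python
import bisect

# Hypothetical object table: (start_address, end_address, name), sorted by start.
objects = sorted([
    (0x7f0000000000, 0x7f0000100000, "grid_A"),
    (0x7f0000200000, 0x7f0000280000, "halo_buffers"),
])
starts = [start for start, _, _ in objects]

def attribute(sample_addr):
    """Return the name of the data object containing the sampled address."""
    i = bisect.bisect_right(starts, sample_addr) - 1
    if i >= 0 and sample_addr < objects[i][1]:
        return objects[i][2]
    return "unattributed"

# Aggregate sampled access latencies per data object.
costs = {}
for addr, latency in [(0x7f0000000040, 120), (0x7f0000240000, 300), (0x1000, 15)]:
    costs[attribute(addr)] = costs.get(attribute(addr), 0) + latency
print(costs)  # {'grid_A': 120, 'halo_buffers': 300, 'unattributed': 15}
```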
In this paper we present WiMesh, a software tool we developed during the last ten years of research in the field of multi-radio wireless mesh networks. WiMesh serves two main purposes: (i) to run different algorithms for the assignment of channels, transmission rate, and power to the available network radios; (ii) to automatically set up and run ns-3 simulations based on the network configuration returned by such algorithms. WiMesh basically consists of three libraries and three corresponding utilities that allow experiments to be conducted easily. All such utilities accept as input an XML configuration file where a number of options can be specified. WiMesh is freely available to the research community, with the purpose of easing the development of new algorithms and the verification of their performance.
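As a hedged sketch of the configuration-driven workflow described above, the Python below reads options from an XML file and turns them into an ns-3 simulation run. The element and attribute names, the program name, and the invocation are all invented; WiMesh defines its own XML schema and its own utilities.

```python
import subprocess
import xml.etree.ElementTree as ET

# Hypothetical illustration: translate an XML option file into command-line
# arguments for an ns-3 scenario and launch it. Names are invented.
def run_simulation(xml_path):
    root = ET.parse(xml_path).getroot()
    args = [f"--{opt.get('name')}={opt.get('value')}" for opt in root.iter("option")]
    # Older ns-3 releases use ./waf --run; newer ones use the ./ns3 wrapper.
    cmd = ["./waf", "--run", "wimesh-sim " + " ".join(args)]
    subprocess.run(cmd, check=True)

# Example (invented) configuration file:
# <config>
#   <option name="numRadios" value="3"/>
#   <option name="channelAssignment" value="greedy"/>
# </config>
```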
Providers of critical infrastructure services strive to maintain the high availability of their SCADA systems. This paper reports on our experience designing, architecting, and evaluating the first survivable SCADA system, one that is able to ensure correct behavior with minimal performance degradation even during cyber attacks that compromise part of the system. We describe the challenges we faced when integrating modern intrusion-tolerant protocols with a conventional SCADA architecture and present the techniques we developed to overcome these challenges. The results illustrate that our survivable SCADA system not only functions correctly in the face of a cyber attack, but also processes in excess of 20,000 messages per second with a latency of less than 30 ms, making it suitable for even large-scale deployments managing thousands of remote terminal units.