Biblio
In a continually evolving cyber-threat landscape, the detection and prevention of cyber attacks has become a complex task. Technological developments have led organisations to digitise the majority of their operations. This practice, however, has its perils, since cybespace offers a new attack-surface. Institutions which are tasked to protect organisations from these threats utilise mainly network data and their incident response strategy remains oblivious to the needs of the organisation when it comes to protecting operational aspects. This paper presents a system able to combine threat intelligence data, attack-trend data and organisational data (along with other data sources available) in order to achieve automated network-defence actions. Our approach combines machine learning, visual analytics and information from business processes to guide through a decision-making process for a Security Operation Centre environment. We test our system on two synthetic scenarios and show that correlating network data with non-network data for automated network defences is possible and worth investigating further.
Performing large-scale malware classification is increasingly becoming a critical step in malware analytics as the number and variety of malware samples is rapidly growing. Statistical machine learning constitutes an appealing method to cope with this increase as it can use mathematical tools to extract information out of large-scale datasets and produce interpretable models. This has motivated a surge of scientific work in developing machine learning methods for detection and classification of malicious executables. However, an optimal method for extracting the most informative features for different malware families, with the final goal of malware classification, is yet to be found. Fortunately, neural networks have evolved to the state that they can surpass the limitations of other methods in terms of hierarchical feature extraction. Consequently, neural networks can now offer superior classification accuracy in many domains such as computer vision and natural language processing. In this paper, we transfer the performance improvements achieved in the area of neural networks to model the execution sequences of disassembled malicious binaries. We implement a neural network that consists of convolutional and feedforward neural constructs. This architecture embodies a hierarchical feature extraction approach that combines convolution of n-grams of instructions with plain vectorization of features derived from the headers of the Portable Executable (PE) files. Our evaluation results demonstrate that our approach outperforms baseline methods, such as simple Feedforward Neural Networks and Support Vector Machines, as we achieve 93% on precision and recall, even in case of obfuscations in the data.
As the malware threat landscape is constantly evolving and over one million new malware strains are being generated every day [1], early automatic detection of threats constitutes a top priority of cybersecurity research, and amplifies the need for more advanced detection and classification methods that are effective and efficient. In this paper, we present the application of machine learning algorithms to predict the length of time malware should be executed in a sandbox to reveal its malicious intent. We also introduce a novel hybrid approach to malware classification based on static binary analysis and dynamic analysis of malware. Static analysis extracts information from a binary file without executing it, and dynamic analysis captures the behavior of malware in a sandbox environment. Our experimental results show that by turning the aforementioned problems into machine learning problems, it is possible to get an accuracy of up to 90% on the prediction of the malware analysis run time and up to 92% on the classification of malware families.
We present AVAMAT: AntiVirus and Malware Analysis Tool - a tool for analysing the malware detection capabilities of AntiVirus (AV) products running on different operating system (OS) platforms. Even though similar tools are available, such as VirusTotal and MetaDefender, they have several limitations, which motivated the creation of our own tool. With AVAMAT we are able to analyse not only whether an AV detects a malware, but also at what stage of inspection does it detect it and on what OS. AVAMAT enables experimental campaigns to answer various research questions, ranging from the detection capabilities of AVs on OSs, to optimal ways in which AVs could be combined to improve malware detection capabilities.
Malware analysis relies heavily on the use of virtual machines (VMs) for functionality and safety. There are subtle differences in operation between virtual and physical machines. Contemporary malware checks for these differences and changes its behavior when it detects a VM presence. These anti-VM techniques hinder malware analysis. Existing research approaches to uncover differences between VMs and physical machines use randomized testing, and thus cannot guarantee completeness. In this article, we propose a detect-and-hide approach, which systematically addresses anti-VM techniques in malware. First, we propose cardinal pill testing—a modification of red pill testing that aims to enumerate the differences between a given VM and a physical machine through carefully designed tests. Cardinal pill testing finds five times more pills by running 15 times fewer tests than red pill testing. We examine the causes of pills and find that, while the majority of them stem from the failure of VMs to follow CPU specifications, a small number stem from under-specification of certain instructions by the Intel manual. This leads to divergent implementations in different CPU and VM architectures. Cardinal pill testing successfully enumerates the differences that stem from the first cause. Finally, we propose VM Cloak—a WinDbg plug-in which hides the presence of VMs from malware. VM Cloak monitors each execute malware command, detects potential pills, and at runtime modifies the command’s outcomes to match those that a physical machine would generate. We implemented VM Cloak and verified that it successfully hides VM presence from malware.
Malware damages computers and the threat is a serious problem. Malware can be detected by pattern matching method or dynamic heuristic method. However, it is difficult to detect all new malware subspecies perfectly by existing methods. In this paper, we propose a new method which automatically detects new malware subspecies by static analysis of execution files and machine learning. The method can distinguish malware from benignware and it can also classify malware subspecies into malware families. We combine static analysis of execution files with machine learning classifier and natural language processing by machine learning. Information of DLL Import, assembly code and hexdump are acquired by static analysis of execution files of malware and benignware to create feature vectors. Paragraph vectors of information by static analysis of execution files are created by machine learning of PV-DBOW model for natural language processing. Support vector machine and classifier of k-nearest neighbor algorithm are used in our method, and the classifier learns paragraph vectors of information by static analysis. Unknown execution files are classified into malware or benignware by pre-learned SVM. Moreover, malware subspecies are also classified into malware families by pre-learned k-nearest. We evaluate the accuracy of the classification by experiments. We think that new malware subspecies can be effectively detected by our method without existing methods for malware analysis such as generic method and dynamic heuristic method.
Code reuse detection is a key technique in reverse engineering. However, existing source code similarity comparison techniques are not applicable to binary code. Moreover, compilers have made this problem even more difficult due to the fact that different assembly code and control flow structures can be generated by the compilers even when implementing the same functionality. To address this problem, we present a fuzzy matching approach to compare two functions. We first obtain an initial mapping between basic blocks by leveraging the concept of longest common subsequence on the basic block level and execution path level. We then extend the achieved mapping using neighborhood exploration. To make our approach applicable to large data sets, we designed an effective filtering process using Minhashing. Based on the proposed approach, we implemented a tool named BinSequence and conducted extensive experiments with it. Our results show that given a large assembly code repository with millions of functions, BinSequence is efficient and can attain high quality similarity ranking of assembly functions with an accuracy of above 90%. We also present several practical use cases including patch analysis, malware analysis and bug search.
The increasing growth of cybercrimes targeting mobile devices urges an efficient malware analysis platform. With the emergence of evasive malware, which is capable of detecting that it is being analyzed in virtualized environments, bare-metal analysis has become the definitive resort. Existing works mainly focus on extracting the malicious behaviors exposed during bare-metal analysis. However, after malware analysis, it is equally important to quickly restore the system to a clean state to examine the next sample. Unfortunately, state-of-the-art solutions on mobile platforms can only restore the disk, and require a time-consuming system reboot. In addition, all of the existing works require some in-guest components to assist the restoration. Therefore, a kernel-level malware is still able to detect the presence of the in-guest components. We propose Bolt, a transparent restoration mechanism for bare-metal analysis on mobile platform without rebooting. Bolt achieves a reboot-less restoration by simultaneously making a snapshot for both the physical memory and the disk. Memory snapshot is enabled by an isolated operating system (BoltOS) in the ARM TrustZone secure world, and disk snapshot is accomplished by a piece of customized firmware (BoltFTL) for flash-based block devices. Because both the BoltOS and the BoltFTL are isolated from the guest system, even kernel-level malware cannot interfere with the restoration. More importantly, Bolt does not require any modifications into the guest system. As such, Bolt is the first that simultaneously achieves efficiency, isolation, and stealthiness to recover from infection due to malware execution. We have implemented a Bolt prototype working with the Android OS. Experimental results show that Bolt can restore the guest system to a clean state in only 2.80 seconds.
Logic locking has been conceived as a promising proactive defense strategy against intellectual property (IP) piracy, counterfeiting, hardware Trojans, reverse engineering, and overbuilding attacks. Yet, various attacks that use a working chip as an oracle have been launched on logic locking to successfully retrieve its secret key, undermining the defense of all existing locking techniques. In this paper, we propose stripped-functionality logic locking (SFLL), which strips some of the functionality of the design and hides it in the form of a secret key(s), thereby rendering on-chip implementation functionally different from the original one. When loaded onto an on-chip memory, the secret keys restore the original functionality of the design. Through security-aware synthesis that creates a controllable mismatch between the reverse-engineered netlist and original design, SFLL provides a quantifiable and provable resilience trade-off between all known and anticipated attacks. We demonstrate the application of SFLL to large designs (textgreater100K gates) using a computer-aided design (CAD) framework that ensures attaining the desired security level at minimal implementation cost, 8%, 5%, and 0.5% for area, power, and delay, respectively. In addition to theoretical proofs and simulation confirmation of SFLL's security, we also report results from the silicon implementation of SFLL on an ARM Cortex-M0 microprocessor in 65nm technology.
Logic locking is an intellectual property (IP) protection technique that prevents IP piracy, reverse engineering and overbuilding attacks by the untrusted foundry or end-users. Existing logic locking techniques are all based on locking the functionality; the design/chip is nonfunctional unless the secret key has been loaded. Existing techniques are vulnerable to various attacks, such as sensitization, key-pruning, and signal skew analysis enabled removal attacks. In this paper, we propose a tenacious and traceless logic locking technique, TTlock, that locks functionality and provably withstands all known attacks, such as SAT-based, sensitization, removal, etc. TTLock protects a secret input pattern; the output of a logic cone is flipped for that pattern, where this flip is restored only when the correct key is applied. Experimental results confirm our theoretical expectations that the computational complexity of attacks launched on TTLock grows exponentially with increasing key-size, while the area, power, and delay overhead increases only linearly. In this paper, we also coin ``parametric locking," where the design/chip behaves as per its specifications (performance, power, reliability, etc.) only with the secret key in place, and an incorrect key downgrades its parametric characteristics. We discuss objectives and challenges in parametric locking.
Reversible circuits are vulnerable to intellectual property and integrated circuit piracy. To show these vulnerabilities, a detailed understanding on how to identify the function embedded in a reversible circuit is crucial. To obtain the embedded function, one needs to know the synthesis approach used to generate the reversible circuit in the first place. We present a machine learning based scheme to identify the synthesis approach using telltale signs in the design.
Scan-based test is commonly used to increase testability and fault coverage, however, it is also known to be a liability for chip security. Research has shown that intellectual property (IP) or secret keys can be leaked through scan-based attacks. In this paper, we propose a dynamically-obfuscated scan design for protecting IPs against scan-based attacks. By perturbing all test patterns/responses and protecting the obfuscation key, the proposed architecture is proven to be robust against existing non-invasive scan attacks, and can protect all scan data from attackers in foundry, assembly, and system developers (i.e., OEMs) without compromising the testability. Furthermore, the proposed architecture can be easily plugged into EDA generated scan chains without having a noticeable impact on conventional integrated circuit (IC) design, manufacturing, and test flow. Finally, detailed security and experimental analyses have been performed on several benchmarks. The results demonstrate that the proposed method can protect chips from existing brute force, differential, and other scan-based attacks that target the obfuscation key. The proposed design is of low overhead on area, power consumption, and pattern generation time, and there is no impact on test time.
Intellectual property is inextricably linked to the innovative development of mass innovation spaces. The synthetic development of intellectual property and mass innovation spaces will fundamentally support the new economic model of “mass entrepreneurship and innovation”. As such, it is critical to explore intellectual property service standards for mass innovation spaces and to steer mass innovation spaces to the creation of an intellectual property service system catering to “makers”. In addition, it is crucial to explore intellectual cluster management innovations for mass innovation spaces.
The size of counterfeiting activities is increasing day by day. These activities are encountered especially in electronics market. In this paper, a countermeasure against counterfeiting on intellectual properties (IP) on Field-Programmable Gate Arrays (FPGA) is proposed. FPGA vendors provide bitstream ciphering as an IP security solution such as battery-backed or non-volatile FPGAs. However, these solutions are secure as long as they can keep decryption key away from third parties. Key storage and key transfer over unsecure channels expose risks for these solutions. In this work, physical unclonable functions (PUFs) have been used for key generation. Generating a key from a circuit in the device solves key transfer problem. Proposed system goes through different phases when it operates. Therefore, partial reconfiguration feature of FPGAs is essential for feasibility of proposed system.
The trend in computing is towards the use of FPGAs to improve performance at reduced costs. An indication of this is the adoption of FPGAs for data centre and server application acceleration by notable technological giants like Microsoft, Amazon, and Baidu. The continued protection of Intellectual Properties (IPs) on the FPGA has thus become both more important and challenging. To facilitate IP security, FPGA vendors have provided bitstream authentication and encryption. However, advancements in FPGA programming technology have engendered a bitstream manipulation technique like partial bitstream relocation (PBR), which is promising in terms of reducing bitstream storage cost and facilitating adaptability. Meanwhile, encrypted bitstreams are not amenable to PBR. In this paper, we present three methods for performing encrypted PBR with varying overheads of resources and time. These methods ensure that PBR can be applied to bitstreams without losing the protection of IPs.
For the increasing use of internet, it is equally important to protect the intellectual property. And for the protection of copyright, a blind digital watermark algorithm with SVD and OSELM in the IWT domain has been proposed. During the embedding process, SVD has been applied to the coefficient blocks to get the singular values in the IWT domain. Singular values are modulated to embed the watermark in the host image. Online sequential extreme learning machine is trained to learn the relationship between the original coefficient and the corresponding watermarked version. During the extraction process, this trained OSELM is used to extract the embedded watermark logo blindly as no original host image is required during this process. The watermarked image is altered using various attacks like blurring, noise, sharpening, rotation and cropping. The experimental results show that the proposed watermarking scheme is robust against various attacks. The extracted watermark has very much similarity with the original watermark and works good to prove the ownership.