Biblio
Data races are often hard to detect in device drivers, due to the non-determinism of concurrent execution. According to our study of Linux driver patches that fix data races, more than 38% of patches involve a pattern that we call inconsistent lock protection. Specifically, if a variable is accessed within two concurrently executed functions, the sets of locks held around each access are disjoint, at least one of the locksets is non-empty, and at least one of the involved accesses is a write, then a data race may occur.In this paper, we present a runtime analysis approach, named DILP, to detect data races caused by inconsistent lock protection in device drivers. By monitoring driver execution, DILP collects the information about runtime variable accesses and executed functions. Then after driver execution, DILP analyzes the collected information to detect and report data races caused by inconsistent lock protection. We evaluate DILP on 12 device drivers in Linux 4.16.9, and find 25 real data races.
The Internet of Things is stepping out of its infancy into full maturity, requiring massive data processing and storage. Unfortunately, because of the unique characteristics of resource constraints, short-range communication, and self-organization in IoT, it always resorts to the cloud or fog nodes for outsourced computation and storage, which has brought about a series of novel challenging security and privacy threats. For this reason, one of the critical challenges of having numerous IoT devices is the capacity to manage them and their data. A specific concern is from which devices or Edge clouds to accept join requests or interaction requests. This paper discusses a design concept for developing the IoT data management platform, along with a data management and lineage traceability implementation of the platform based on blockchain and smart contracts, which approaches the two major challenges: how to implement effective data management and enrich rational interoperability for trusted groups of linked Things; And how to settle conflicts between untrusted IoT devices and its requests taking into account security and privacy preserving. Experimental results show that the system scales well with the loss of computing and communication performance maintaining within the acceptable range, works well to effectively defend against unauthorized access and empower data provenance and transparency, which verifies the feasibility and efficiency of the design concept to provide privacy, fine-grained, and integrity data management over the IoT devices by introducing the blockchain-based data management platform.
As a modern power transmission network, smart grid connects plenty of terminal devices. However, along with the growth of devices are the security threats. Different from the previous separated environment, an adversary nowadays can destroy the power system by attacking these devices. Therefore, it's critical to ensure the security and safety of terminal devices. To achieve this goal, detecting the pre-existing vulnerabilities of the device program and enhance the terminal security, are of great importance and necessity. In this paper, we propose a novel approach that detects existing buffer-overflow vulnerabilities of terminal devices via automatic static analysis (ASA). We utilize the static analysis to extract the device program information and build corresponding program models. By further matching the generated program model with pre-defined vulnerability patterns, we achieve vulnerability detection and error reporting. The evaluation results demonstrate that our method can effectively detect buffer-overflow vulnerabilities of smart terminals with a high accuracy and a low false positive rate.
Return Oriented Programming is one of the most important software security challenges nowadays. It exploits memory vulnerabilities to control the state of the program and hijacks its control flow. Existing defenses usually focus on how to protect the control flow or face the challenge of how to maintain the taint markings for memory data. In this paper, we directly focus on the adversary-controlled states, simplify the classic dynamic taint analysis method to only track registers and propose Hardware-based Adversary-controlled States Tracking (HAST). HAST dynamically tracks registers that may be controlled by the adversary to detect ROP attack. It is transparent to user application and makes few modifications to existing hardware. Our evaluation demonstrates that HAST will introduce almost no performance overhead and can effectively detect ROP attacks without false positives on the tested common Linux applications.
With so much our daily lives relying on digital devices like personal computers and cell phones, there is a growing demand for code that not only functions properly, but is secure and keeps user data safe. However, ensuring this is not such an easy task, and many developers do not have the required skills or resources to ensure their code is secure. Many code analysis tools have been written to find vulnerabilities in newly developed code, but this technology tends to produce many false positives, and is still not able to identify all of the problems. Other methods of finding software vulnerabilities automatically are required. This proof-of-concept study applied natural language processing on Java byte code to locate SQL injection vulnerabilities in a Java program. Preliminary findings show that, due to the high number of terms in the dataset, using singular decision trees will not produce a suitable model for locating SQL injection vulnerabilities, while random forest structures proved more promising. Still, further work is needed to determine the best classification tool.
Approaches for the automatic analysis of security policies on source code level cannot trivially be applied to binaries. This is due to the lacking high-level semantics of low-level object code, and the fundamental problem that control-flow recovery from binaries is difficult. We present a novel approach to recover the control-flow of binaries that is both safe and efficient. The key idea of our approach is to use the information contained in security mechanisms to approximate the targets of computed branches. To achieve this, we first define a restricted control transition intermediate language (RCTIL), which restricts the number of possible targets for each branch to a finite number of given targets. Based on this intermediate language, we demonstrate how a safe model of the control flow can be recovered without data-flow analyses. Our evaluation shows that that makes our solution more efficient than existing solutions.
Recently researchers have proposed using deep learning-based systems for malware detection. Unfortunately, all deep learning classification systems are vulnerable to adversarial learning-based attacks, or adversarial attacks, where miscreants can avoid detection by the classification algorithm with very few perturbations of the input data. Previous work has studied adversarial attacks against static analysis-based malware classifiers which only classify the content of the unknown file without execution. However, since the majority of malware is either packed or encrypted, malware classification based on static analysis often fails to detect these types of files. To overcome this limitation, anti-malware companies typically perform dynamic analysis by emulating each file in the anti-malware engine or performing in-depth scanning in a virtual machine. These strategies allow the analysis of the malware after unpacking or decryption. In this work, we study different strategies of crafting adversarial samples for dynamic analysis. These strategies operate on sparse, binary inputs in contrast to continuous inputs such as pixels in images. We then study the effects of two, previously proposed defensive mechanisms against crafted adversarial samples including the distillation and ensemble defenses. We also propose and evaluate the weight decay defense. Experiments show that with these three defenses, the number of successfully crafted adversarial samples is reduced compared to an unprotected baseline system. In particular, the ensemble defense is the most resilient to adversarial attacks. Importantly, none of the defenses significantly reduce the classification accuracy for detecting malware. Finally, we show that while adding additional hidden layers to neural models does not significantly improve the malware classification accuracy, it does significantly increase the classifier's robustness to adversarial attacks.
In the development process of critical systems, one of the main challenges is to provide early system validation and verification against vulnerabilities in order to reduce cost caused by late error detection. We propose in this paper an approach that, firstly allows formally describe system security specifications, thanks to our suggested extended attack tree. Secondly, static and dynamic system modeling by using a SysML connectivity profile to model error propagation is introduced. Finally, a model checker has been used in order to validate system specifications.
This paper aims to explain static analysis techniques in detail, and to highlight the weaknesses and challenges which face it. To this end, more than 80 static analysis-based framework have been studied, and in their light, the process of detecting malicious applications has been divided into four phases that were explained in a schematic manner. Also, the features that is used in static analysis were discussed in detail by dividing it into four categories namely, Manifest-based features, code-based features, semantic features and app's metadata-based features. Also, the challenges facing methods based on static analysis were discussed in detail. Finally, a case study was conducted to test the strength of some known commercial antivirus and one of the stat-of-art academic static analysis frameworks against obfuscation techniques used by developers of malicious applications. The results showed a significant impact on the performance of the most tested antiviruses and frameworks, which is reflecting the urgent need for more accurately tools.
Parfait [1] is a static analysis tool originally developed to find implementation defects in C/C++ systems code. Parfait's focus is on proving both high precision (low false positives) as well as scaling to systems with millions of lines of code (typically requiring 10 minutes of analysis time per million lines). Parfait has since been extended to detect security vulnerabilities in applications code, supporting the Java EE and PL/SQL server stack. In this abstract we describe some of the challenges we encountered in this process including some of the differences seen between the applications code being analysed, our solutions that enable us to analyse a variety of applications, and a summary of the challenges that remain.
The article considers the approach to identifying potentially unsafe data in program code of embedded systems which can lead to errors and fails in the functioning of equipment. The sources of invalid data are revealed and the process of changing the status of this data in process of static code analysis is shown. The mechanism for annotating functions that operate on unsafe data is described, which allows to control the entire process of using them and thus it will improve the quality of the output code.
Fuzzing is a simple yet effective approach to discover software bugs utilizing randomly generated inputs. However, it is limited by coverage and cannot find bugs hidden in deep execution paths of the program because the randomly generated inputs fail complex sanity checks, e.g., checks on magic values, checksums, or hashes. To improve coverage, existing approaches rely on imprecise heuristics or complex input mutation techniques (e.g., symbolic execution or taint analysis) to bypass sanity checks. Our novel method tackles coverage from a different angle: by removing sanity checks in the target program. T-Fuzz leverages a coverage-guided fuzzer to generate inputs. Whenever the fuzzer can no longer trigger new code paths, a light-weight, dynamic tracing based technique detects the input checks that the fuzzer-generated inputs fail. These checks are then removed from the target program. Fuzzing then continues on the transformed program, allowing the code protected by the removed checks to be triggered and potential bugs discovered. Fuzzing transformed programs to find bugs poses two challenges: (1) removal of checks leads to over-approximation and false positives, and (2) even for true bugs, the crashing input on the transformed program may not trigger the bug in the original program. As an auxiliary post-processing step, T-Fuzz leverages a symbolic execution-based approach to filter out false positives and reproduce true bugs in the original program. By transforming the program as well as mutating the input, T-Fuzz covers more code and finds more true bugs than any existing technique. We have evaluated T-Fuzz on the DARPA Cyber Grand Challenge dataset, LAVA-M dataset and 4 real-world programs (pngfix, tiffinfo, magick and pdftohtml). For the CGC dataset, T-Fuzz finds bugs in 166 binaries, Driller in 121, and AFL in 105. In addition, found 3 new bugs in previously-fuzzed programs and libraries.
When implemented on real systems, cryptographic algorithms are vulnerable to attacks observing their execution behavior, such as cache-timing attacks. Designing protected implementations must be done with knowledge and validation tools as early as possible in the development cycle. In this article we propose a methodology to assess the robustness of the candidates for the NIST post-quantum standardization project to cache-timing attacks. To this end we have developed a dedicated vulnerability research tool. It performs a static analysis with tainting propagation of sensitive variables across the source code and detects leakage patterns. We use it to assess the security of the NIST post-quantum cryptography project submissions. Our results show that more than 80% of the analyzed implementations have at least one potential flaw, and three submissions total more than 1000 reported flaws each. Finally, this comprehensive study of the competitors security allows us to identify the most frequent weaknesses amongst candidates and how they might be fixed.
With the development of cyber threats on the Internet, the number of malware, especially unknown malware, is also dramatically increasing. Since all of malware cannot be analyzed by analysts, it is very important to find out new malware that should be analyzed by them. In order to cope with this issue, the existing approaches focused on malware classification using static or dynamic analysis results of malware. However, the static and the dynamic analyses themselves are also too costly and not easy to build the isolated, secure and Internet-like analysis environments such as sandbox. In this paper, we propose a lightweight malware classification method based on detection results of anti-virus software. Since the proposed method can reduce the volume of malware that should be analyzed by analysts, it can be used as a preprocess for in-depth analysis of malware. The experimental showed that the proposed method succeeded in classification of 1,000 malware samples into 187 unique groups. This means that 81% of the original malware samples do not need to analyze by analysts.
Anti-virus vendors receive hundreds of thousands of malware to be analysed each day. Some are new malware while others are variations or evolutions of existing malware. Because analyzing each malware sample by hand is impossible, automated techniques to analyse and categorize incoming samples are needed. In this work, we explore various machine learning features extracted from malware samples through static analysis for classification of malware binaries into already known malware families. We present a new feature based on control statement shingling that has a comparable accuracy to ordinary opcode n-gram based features while requiring smaller dimensions. This, in turn, results in a shorter training time.
This paper presents our results from identifying anddocumenting false positives generated by static code analysistools. By false positives, we mean a static code analysis toolgenerates a warning message, but the warning message isnot really an error. The goal of our study is to understandthe different kinds of false positives generated so we can (1)automatically determine if an error message is truly indeed a truepositive, and (2) reduce the number of false positives developersand testers must triage. We have used two open-source tools andone commercial tool in our study. The results of our study haveled to 14 core false positive patterns, some of which we haveconfirmed with static code analysis tool developers.
The article considers the approach to static analysis of program code and the general principles of static analyzer operation. The authors identify the most important syntactic and semantic information in the programs, which can be used to find errors in the source code. The general methodology for development of diagnostic rules is proposed, which will improve the efficiency of static code analyzers.
Static code analysis is a convenient technique to support the development of software. Without prior test setup, information about a later runtime behavior can be inferred and errors in the code can be found before using a regular compiler. Solutions to apply static code analysis to PLC software following the IEC 61131-3 already exist, but using these separate tools usually creates a gap in the development process. In this paper we introduce an architecture to use static analysis directly in a development environment and give instant feedback to the developer while he is still editing the PLC software.
PHP is one of the most popular web development tools in use today. A major concern though is the improper and insecure uses of the language by application developers, motivating the development of various static analyses that detect security vulnerabilities in PHP programs. However, many of these approaches do not handle recent, important PHP features such as object orientation, which greatly limits the use of such approaches in practice. In this paper, we present OOPIXY, a security analysis tool that extends the PHP security analyzer PIXY to support reasoning about object-oriented features in PHP applications. Our empirical evaluation shows that OOPIXY detects 88% of security vulnerabilities found in micro benchmarks. When used on real-world PHP applications, OOPIXY detects security vulnerabilities that could not be detected using state-of-the-art tools, retaining a high level of precision. We have contacted the maintainers of those applications, and two applications' development teams verified the correctness of our findings. They are currently working on fixing the bugs that lead to those vulnerabilities.
The growing popularity of Android applications makes them vulnerable to security threats. There exist several studies that focus on the analysis of the behaviour of Android applications to detect the repackaged and malicious ones. These techniques use a variety of features to model the application's behaviour, among which the calls to Android API, made by the application components, are shown to be the most reliable. To generate the APIs that an application calls is not an easy task. This is because most malicious applications are obfuscated and do not come with the source code. This makes the problem of identifying the API methods invoked by an application an interesting research issue. In this paper, we present HyDroid, a hybrid approach that combines static and dynamic analysis to generate API call traces from the execution of an application's services. We focus on services because they contain key characteristics that allure attackers to misuse them. We show that HyDroid can be used to extract API call trace signatures of several malware families.
Currently, mobile botnet attacks have shifted from computers to smartphones due to its functionality, ease to exploit, and based on financial intention. Mostly, it attacks Android due to its popularity and high usage among end users. Every day, more and more malicious mobile applications (apps) with the botnet capability have been developed to exploit end users' smartphones. Therefore, this paper presents a new mobile botnet classification based on permission and Application Programming Interface (API) calls in the smartphone. This classification is developed using static analysis in a controlled lab environment and the Drebin dataset is used as the training dataset. 800 apps from the Google Play Store have been chosen randomly to test the proposed classification. As a result, 16 permissions and 31 API calls that are most related with mobile botnet have been extracted using feature selection and later classified and tested using machine learning algorithms. The experimental result shows that the Random Forest Algorithm has achieved the highest detection accuracy of 99.4% with the lowest false positive rate of 16.1% as compared to other machine learning algorithms. This new classification can be used as the input for mobile botnet detection for future work, especially for financial matters.
Mobile apps are widely adopted in daily life, and contain increasing security flaws. Many regulatory agencies and organizations have announced security guidelines for app development. However, most security guidelines involving technicality and compliance with this requirement is not easily feasible. Thus, we propose Mobile Apps Assessment and Analysis System (MAS), an automatic security validation system to improve guideline compliance. MAS combines static and dynamic analysis techniques, which can be used to verify whether android apps meet the security guideline requirements. We implemented MAS in practice and verified 143 real-world apps produced by the Taiwan government. Besides, we also validated 15,000 popular apps collected from Google Play Store produced in three countries. We found that most apps contain at least three security issues. Finally, we summarize the results and list the most common security flaws for consideration in further app development.