Visible to the public Biblio

Filters: Keyword is Computer bugs  [Clear All Filters]
2023-06-09
Qiang, Weizhong, Luo, Hao.  2022.  AutoSlicer: Automatic Program Partitioning for Securing Sensitive Data Based-on Data Dependency Analysis and Code Refactoring. 2022 IEEE International Conference on Trust, Security and Privacy in Computing and Communications (TrustCom). :239—247.
Legacy programs are normally monolithic (that is, all code runs in a single process and is not partitioned), and a bug in a program may result in the entire program being vulnerable and therefore untrusted. Program partitioning can be used to separate a program into multiple partitions, so as to isolate sensitive data or privileged operations. Manual program partitioning requires programmers to rewrite the entire source code, which is cumbersome, error-prone, and not generic. Automatic program partitioning tools can separate programs according to the dependency graph constructed based on data or programs. However, programmers still need to manually implement remote service interfaces for inter-partition communication. Therefore, in this paper, we propose AutoSlicer, whose purpose is to partition a program more automatically, so that the programmer is only required to annotate sensitive data. AutoSlicer constructs accurate data dependency graphs (DDGs) by enabling execution flow graphs, and the DDG-based partitioning algorithm can compute partition information based on sensitive annotations. In addition, the code refactoring toolchain can automatically transform the source code into sensitive and insensitive partitions that can be deployed on the remote procedure call framework. The experimental evaluation shows that AutoSlicer can effectively improve the accuracy (13%-27%) of program partitioning by enabling EFG, and separate real-world programs with a relatively smaller performance overhead (0.26%-9.42%).
2023-05-12
Qiu, Zhengyi, Shao, Shudi, Zhao, Qi, Khan, Hassan Ali, Hui, Xinning, Jin, Guoliang.  2022.  A Deep Study of the Effects and Fixes of Server-Side Request Races in Web Applications. 2022 IEEE/ACM 19th International Conference on Mining Software Repositories (MSR). :744–756.

Server-side web applications are vulnerable to request races. While some previous studies of real-world request races exist, they primarily focus on the root cause of these bugs. To better combat request races in server-side web applications, we need a deep understanding of their characteristics. In this paper, we provide a complementary focus on race effects and fixes with an enlarged set of request races from web applications developed with Object-Relational Mapping (ORM) frameworks. We revisit characterization questions used in previous studies on newly included request races, distinguish the external and internal effects of request races, and relate requestrace fixes with concurrency control mechanisms in languages and frameworks for developing server-side web applications. Our study reveals that: (1) request races from ORM-based web applications share the same characteristics as those from raw-SQL web applications; (2) request races violating application semantics without explicit crashes and error messages externally are common, and latent request races, which only corrupt some shared resource internally but require extra requests to expose the misbehavior, are also common; and (3) various fix strategies other than using synchronization mechanisms are used to fix request races. We expect that our results can help developers better understand request races and guide the design and development of tools for combating request races.

ISSN: 2574-3864

2023-04-28
Kudrjavets, Gunnar, Kumar, Aditya, Nagappan, Nachiappan, Rastogi, Ayushi.  2022.  The Unexplored Terrain of Compiler Warnings. 2022 IEEE/ACM 44th International Conference on Software Engineering: Software Engineering in Practice (ICSE-SEIP). :283–284.
The authors' industry experiences suggest that compiler warnings, a lightweight version of program analysis, are valuable early bug detection tools. Significant costs are associated with patches and security bulletins for issues that could have been avoided if compiler warnings were addressed. Yet, the industry's attitude towards compiler warnings is mixed. Practices range from silencing all compiler warnings to having a zero-tolerance policy as to any warnings. Current published data indicates that addressing compiler warnings early is beneficial. However, support for this value theory stems from grey literature or is anecdotal. Additional focused research is needed to truly assess the cost-benefit of addressing warnings.
2023-04-14
Pise, Rohini, Patil, Sonali.  2022.  A Deep Dive into Blockchain-based Smart Contract-specific Security Vulnerabilities. 2022 IEEE International Conference on Blockchain and Distributed Systems Security (ICBDS). :1–6.
Blockchain smart contracts are prevalent nowadays as numerous applications are developed based on this feature. Though smart contracts are important and widely used, they contain certain vulnerabilities. This paper discusses various security issues that arise in smart contract applications. They are categorized in the smart contract platform, the applications that integrate with the Blockchain, and the vulnerabilities in smart contract code. A detailed study of smart contract-specific vulnerabilities and the defense against those vulnerabilities are presented in this article. Because of certain limitations of platforms or programming language used to write smart contract, there are possibilities of attacks on smart contracts. Hence different security measures or precautions to be taken while writing the smart contract code is discussed in this article. This will prevent the potential attacks happening on Blockchain distributed applications.
2023-03-31
Yang, Jing, Yang, Yibiao, Sun, Maolin, Wen, Ming, Zhou, Yuming, Jin, Hai.  2022.  Isolating Compiler Optimization Faults via Differentiating Finer-grained Options. 2022 IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER). :481–491.

Code optimization is an essential feature for compilers and almost all software products are released by compiler optimizations. Consequently, bugs in code optimization will inevitably cast significant impact on the correctness of software systems. Locating optimization bugs in compilers is challenging as compilers typically support a large amount of optimization configurations. Although prior studies have proposed to locate compiler bugs via generating witness test programs, they are still time-consuming and not effective enough. To address such limitations, we propose an automatic bug localization approach, ODFL, for locating compiler optimization bugs via differentiating finer-grained options in this study. Specifically, we first disable the fine-grained options that are enabled by default under the bug-triggering optimization levels independently to obtain bug-free and bug-related fine-grained options. We then configure several effective passing and failing optimization sequences based on such fine-grained options to obtain multiple failing and passing compiler coverage. Finally, such generated coverage information can be utilized via Spectrum-Based Fault Localization formulae to rank the suspicious compiler files. We run ODFL on 60 buggy GCC compilers from an existing benchmark. The experimental results show that ODFL significantly outperforms the state-of-the-art compiler bug isolation approach RecBi in terms of all the evaluated metrics, demonstrating the effectiveness of ODFL. In addition, ODFL is much more efficient than RecBi as it can save more than 88% of the time for locating bugs on average.

ISSN: 1534-5351

2023-03-03
Lin, Zhenpeng, Chen, Yueqi, Wu, Yuhang, Mu, Dongliang, Yu, Chensheng, Xing, Xinyu, Li, Kang.  2022.  GREBE: Unveiling Exploitation Potential for Linux Kernel Bugs. 2022 IEEE Symposium on Security and Privacy (SP). :2078–2095.
Nowadays, dynamic testing tools have significantly expedited the discovery of bugs in the Linux kernel. When unveiling kernel bugs, they automatically generate reports, specifying the errors the Linux encounters. The error in the report implies the possible exploitability of the corresponding kernel bug. As a result, many security analysts use the manifested error to infer a bug’s exploitability and thus prioritize their exploit development effort. However, using the error in the report, security researchers might underestimate a bug’s exploitability. The error exhibited in the report may depend upon how the bug is triggered. Through different paths or under different contexts, a bug may manifest various error behaviors implying very different exploitation potentials. This work proposes a new kernel fuzzing technique to explore all the possible error behaviors that a kernel bug might bring about. Unlike conventional kernel fuzzing techniques concentrating on kernel code coverage, our fuzzing technique is more directed towards the buggy code fragment. It introduces an object-driven kernel fuzzing technique to explore various contexts and paths to trigger the reported bug, making the bug manifest various error behaviors. With the newly demonstrated errors, security researchers could better infer a bug’s possible exploitability. To evaluate our proposed technique’s effectiveness, efficiency, and impact, we implement our fuzzing technique as a tool GREBE and apply it to 60 real-world Linux kernel bugs. On average, GREBE could manifest 2+ additional error behaviors for each of the kernel bugs. For 26 kernel bugs, GREBE discovers higher exploitation potential. We report to kernel vendors some of the bugs – the exploitability of which was wrongly assessed and the corresponding patch has not yet been carefully applied – resulting in their rapid patch adoption.
ISSN: 2375-1207
2023-02-17
Vélez, Tatiana Castro, Khatchadourian, Raffi, Bagherzadeh, Mehdi, Raja, Anita.  2022.  Challenges in Migrating Imperative Deep Learning Programs to Graph Execution: An Empirical Study. 2022 IEEE/ACM 19th International Conference on Mining Software Repositories (MSR). :469–481.
Efficiency is essential to support responsiveness w.r.t. ever-growing datasets, especially for Deep Learning (DL) systems. DL frameworks have traditionally embraced deferred execution-style DL code that supports symbolic, graph-based Deep Neural Network (DNN) computation. While scalable, such development tends to produce DL code that is error-prone, non-intuitive, and difficult to debug. Consequently, more natural, less error-prone imperative DL frameworks encouraging eager execution have emerged at the expense of run-time performance. While hybrid approaches aim for the “best of both worlds,” the challenges in applying them in the real world are largely unknown. We conduct a data-driven analysis of challenges-and resultant bugs-involved in writing reliable yet performant imperative DL code by studying 250 open-source projects, consisting of 19.7 MLOC, along with 470 and 446 manually examined code patches and bug reports, respectively. The results indicate that hybridization: (i) is prone to API misuse, (ii) can result in performance degradation-the opposite of its intention, and (iii) has limited application due to execution mode incompatibility. We put forth several recommendations, best practices, and anti-patterns for effectively hybridizing imperative DL code, potentially benefiting DL practitioners, API designers, tool developers, and educators.
ISSN: 2574-3864
2023-02-02
Pujar, Saurabh, Zheng, Yunhui, Buratti, Luca, Lewis, Burn, Morari, Alessandro, Laredo, Jim, Postlethwait, Kevin, Görn, Christoph.  2022.  Varangian: A Git Bot for Augmented Static Analysis. 2022 IEEE/ACM 19th International Conference on Mining Software Repositories (MSR). :766–767.

The complexity and scale of modern software programs often lead to overlooked programming errors and security vulnerabilities. Developers often rely on automatic tools, like static analysis tools, to look for bugs and vulnerabilities. Static analysis tools are widely used because they can understand nontrivial program behaviors, scale to millions of lines of code, and detect subtle bugs. However, they are known to generate an excess of false alarms which hinder their utilization as it is counterproductive for developers to go through a long list of reported issues, only to find a few true positives. One of the ways proposed to suppress false positives is to use machine learning to identify them. However, training machine learning models requires good quality labeled datasets. For this purpose, we developed D2A [3], a differential analysis based approach that uses the commit history of a code repository to create a labeled dataset of Infer [2] static analysis output.

2023-01-13
Li, Xiuli, Wang, Guoshi, Wang, Chuping, Qin, Yanyan, Wang, Ning.  2022.  Software Source Code Security Audit Algorithm Supporting Incremental Checking. 2022 IEEE 7th International Conference on Smart Cloud (SmartCloud). :53—58.
Source code security audit is an effective technique to deal with security vulnerabilities and software bugs. As one kind of white-box testing approaches, it can effectively help developers eliminate defects in the code. However, it suffers from performance issues. In this paper, we propose an incremental checking mechanism which enables fast source code security audits. And we conduct comprehensive experiments to verify the effectiveness of our approach.
Chen, Ju, Wang, Jinghan, Song, Chengyu, Yin, Heng.  2022.  JIGSAW: Efficient and Scalable Path Constraints Fuzzing. 2022 IEEE Symposium on Security and Privacy (SP). :18—35.
Coverage-guided testing has shown to be an effective way to find bugs. If we model coverage-guided testing as a search problem (i.e., finding inputs that can cover more branches), then its efficiency mainly depends on two factors: (1) the accuracy of the searching algorithm and (2) the number of inputs that can be evaluated per unit time. Therefore, improving the search throughput has shown to be an effective way to improve the performance of coverage-guided testing.In this work, we present a novel design to improve the search throughput: by evaluating newly generated inputs with JIT-compiled path constraints. This approach allows us to significantly improve the single thread throughput as well as scaling to multiple cores. We also developed several optimization techniques to eliminate major bottlenecks during this process. Evaluation of our prototype JIGSAW shows that our approach can achieve three orders of magnitude higher search throughput than existing fuzzers and can scale to multiple cores. We also find that with such high throughput, a simple gradient-guided search heuristic can solve path constraints collected from a large set of real-world programs faster than SMT solvers with much more sophisticated search heuristics. Evaluation of end-to-end coverage-guided testing also shows that our JIGSAW-powered hybrid fuzzer can outperform state-of-the-art testing tools.
Zhang, Xing, Chen, Jiongyi, Feng, Chao, Li, Ruilin, Diao, Wenrui, Zhang, Kehuan, Lei, Jing, Tang, Chaojing.  2022.  Default: Mutual Information-based Crash Triage for Massive Crashes. 2022 IEEE/ACM 44th International Conference on Software Engineering (ICSE). :635—646.
With the considerable success achieved by modern fuzzing in-frastructures, more crashes are produced than ever before. To dig out the root cause, rapid and faithful crash triage for large numbers of crashes has always been attractive. However, hindered by the practical difficulty of reducing analysis imprecision without compromising efficiency, this goal has not been accomplished. In this paper, we present an end-to-end crash triage solution Default, for accurately and quickly pinpointing unique root cause from large numbers of crashes. In particular, we quantify the “crash relevance” of program entities based on mutual information, which serves as the criterion of unique crash bucketing and allows us to bucket massive crashes without pre-analyzing their root cause. The quantification of “crash relevance” is also used in the shortening of long crashing traces. On this basis, we use the interpretability of neural networks to precisely pinpoint the root cause in the shortened traces by evaluating each basic block's impact on the crash label. Evaluated with 20 programs with 22216 crashes in total, Default demonstrates remarkable accuracy and performance, which is way beyond what the state-of-the-art techniques can achieve: crash de-duplication was achieved at a super-fast processing speed - 0.017 seconds per crashing trace, without missing any unique bugs. After that, it identifies the root cause of 43 unique crashes with no false negatives and an average false positive rate of 9.2%.
2022-12-20
Song, Suhwan, Hur, Jaewon, Kim, Sunwoo, Rogers, Philip, Lee, Byoungyoung.  2022.  R2Z2: Detecting Rendering Regressions in Web Browsers through Differential Fuzz Testing. 2022 IEEE/ACM 44th International Conference on Software Engineering (ICSE). :1818–1829.
A rendering regression is a bug introduced by a web browser where a web page no longer functions as users expect. Such rendering bugs critically harm the usability of web browsers as well as web applications. The unique aspect of rendering bugs is that they affect the presented visual appearance of web pages, but those web pages have no pre-defined correct appearance. Therefore, it is challenging to automatically detect errors in their appearance. In practice, web browser vendors rely on non-trivial and time-prohibitive manual analysis to detect and handle rendering regressions. This paper proposes R2Z2, an automated tool to find rendering regressions. R2Z2 uses the differential fuzz testing approach, which repeatedly compares the rendering results of two different versions of a browser while providing the same HTML as input. If the rendering results are different, R2Z2 further performs cross browser compatibility testing to check if the rendering difference is indeed a rendering regression. After identifying a rendering regression, R2Z2 will perform an in-depth analysis to aid in fixing the regression. Specifically, R2Z2 performs a delta-debugging-like analysis to pinpoint the exact browser source code commit causing the regression, as well as inspecting the rendering pipeline stages to pinpoint which pipeline stage is responsible. We implemented a prototype of R2Z2 particularly targeting the Chrome browser. So far, R2Z2 found 11 previously undiscovered rendering regressions in Chrome, all of which were confirmed by the Chrome developers. Importantly, in each case, R2Z2 correctly reported the culprit commit. Moreover, R2Z2 correctly pin-pointed the culprit rendering pipeline stage in all but one case.
ISSN: 1558-1225
Cheng, Leixiao, Meng, Fei.  2022.  An Improvement on “CryptCloud$^\textrm+\$$: Secure and Expressive Data Access Control for Cloud Storage”. IEEE Transactions on Services Computing. :1–2.
Recently, Ning et al. proposed the “CryptCloud$^\textrm+\$$: Secure and Expressive Data Access Control for Cloud Storage” in IEEE Transaction on Services Computing. This work provided two versatile ciphertext-policy attribute-based encryption (CP-ABE) schemes to achieve flexible access control on encrypted data, namely ATER-CP-ABE and ATIR-CP-ABE, both of which have attractive advantages, such as white-box malicious user traceability, semi-honest authority accountability, public auditing and user revocation. However, we find a bug of access control in both schemes, i.e., a non-revoked user with attribute set \$S\$ can decrypt the ciphertext \$ct\$ encrypted under any access policy \$(A,\textbackslashrho )\$, regardless of whether \$S\$ satisfies \$(A,\textbackslashrho )\$ or not. This paper carefully analyzes the bug, and makes an improvement on Ning's pioneering work, so as to fix it.
Conference Name: IEEE Transactions on Services Computing
2022-10-16
Lee, Sungho, Lee, Hyogun, Ryu, Sukyoung.  2020.  Broadening Horizons of Multilingual Static Analysis: Semantic Summary Extraction from C Code for JNI Program Analysis. 2020 35th IEEE/ACM International Conference on Automated Software Engineering (ASE). :127–137.
Most programming languages support foreign language interoperation that allows developers to integrate multiple modules implemented in different languages into a single multilingual program. While utilizing various features from multiple languages expands expressivity, differences in language semantics require developers to understand the semantics of multiple languages and their inter-operation. Because current compilers do not support compile-time checking for interoperation, they do not help developers avoid in-teroperation bugs. Similarly, active research on static analysis and bug detection has been focusing on programs written in a single language. In this paper, we propose a novel approach to analyze multilingual programs statically. Unlike existing approaches that extend a static analyzer for a host language to support analysis of foreign function calls, our approach extracts semantic summaries from programs written in guest languages using a modular analysis technique, and performs a whole-program analysis with the extracted semantic summaries. To show practicality of our approach, we design and implement a static analyzer for multilingual programs, which analyzes JNI interoperation between Java and C. Our empirical evaluation shows that the analyzer is scalable in that it can construct call graphs for large programs that use JNI interoperation, and useful in that it found 74 genuine interoperation bugs in real-world Android JNI applications.
Trautsch, Alexander, Herbold, Steffen, Grabowski, Jens.  2020.  Static source code metrics and static analysis warnings for fine-grained just-in-time defect prediction. 2020 IEEE International Conference on Software Maintenance and Evolution (ICSME). :127–138.
Software quality evolution and predictive models to support decisions about resource distribution in software quality assurance tasks are an important part of software engineering research. Recently, a fine-grained just-in-time defect prediction approach was proposed which has the ability to find bug-inducing files within changes instead of only complete changes. In this work, we utilize this approach and improve it in multiple places: data collection, labeling and features. We include manually validated issue types, an improved SZZ algorithm which discards comments, whitespaces and refactorings. Additionally, we include static source code metrics as well as static analysis warnings and warning density derived metrics as features. To assess whether we can save cost we incorporate a specialized defect prediction cost model. To evaluate our proposed improvements of the fine-grained just-in-time defect prediction approach we conduct a case study that encompasses 38 Java projects, 492,241 file changes in 73,598 commits and spans 15 years. We find that static source code metrics and static analysis warnings are correlated with bugs and that they can improve the quality and cost saving potential of just-in-time defect prediction models.
Shekarisaz, Mohsen, Talebian, Fatemeh, Jabariani, Marjan, Mehri, Farzad, Faghih, Fathiyeh, Kargahi, Mehdi.  2020.  Program Energy-Hotspot Detection and Removal: A Static Analysis Approach. 2020 CSI/CPSSI International Symposium on Real-Time and Embedded Systems and Technologies (RTEST). :1–8.
The major energy-hungry components in today's battery-operated embedded devices are mostly peripheral modules like LTE, WiFi, GPS, etc. Inefficient use of these modules causes energy hotspots, namely segments of the embedded software in which the module wastes energy. We study two such hotspots in the current paper, and provide the corresponding detection and removal algorithms based on static analysis techniques. The program code hotspots occur due to unnecessary releasing and re-acquiring of a module (which puts the module in power saving mode for a while) and misplaced acquiring of the module (which makes the module or processor to waste energy in idle mode). The detections are performed according to some relation between extreme (worst-case/best-case) execution times of some program segments and time/energy specifications of the module. The experimental results on our benchmarks show about 28 percent of energy reduction after the hotspot removals.
2022-08-12
Liu, Kui, Koyuncu, Anil, Kim, Dongsun, Bissyandè, Tegawende F..  2019.  AVATAR: Fixing Semantic Bugs with Fix Patterns of Static Analysis Violations. 2019 IEEE 26th International Conference on Software Analysis, Evolution and Reengineering (SANER). :1–12.
Fix pattern-based patch generation is a promising direction in Automated Program Repair (APR). Notably, it has been demonstrated to produce more acceptable and correct patches than the patches obtained with mutation operators through genetic programming. The performance of pattern-based APR systems, however, depends on the fix ingredients mined from fix changes in development histories. Unfortunately, collecting a reliable set of bug fixes in repositories can be challenging. In this paper, we propose to investigate the possibility in an APR scenario of leveraging code changes that address violations by static bug detection tools. To that end, we build the AVATAR APR system, which exploits fix patterns of static analysis violations as ingredients for patch generation. Evaluated on the Defects4J benchmark, we show that, assuming a perfect localization of faults, AVATAR can generate correct patches to fix 34/39 bugs. We further find that AVATAR yields performance metrics that are comparable to that of the closely-related approaches in the literature. While AVATAR outperforms many of the state-of-the-art pattern-based APR systems, it is mostly complementary to current approaches. Overall, our study highlights the relevance of static bug finding tools as indirect contributors of fix ingredients for addressing code defects identified with functional test cases.
Stepanov, Daniil, Akhin, Marat, Belyaev, Mikhail.  2021.  Type-Centric Kotlin Compiler Fuzzing: Preserving Test Program Correctness by Preserving Types. 2021 14th IEEE Conference on Software Testing, Verification and Validation (ICST). :318—328.
Kotlin is a relatively new programming language from JetBrains: its development started in 2010 with release 1.0 done in early 2016. The Kotlin compiler, while slowly and steadily becoming more and more mature, still crashes from time to time on the more tricky input programs, not least because of the complexity of its features and their interactions. This makes it a great target for fuzzing, even the basic forms of which can find a significant number of Kotlin compiler crashes. There is a problem with fuzzing, however, closely related to the cause of the crashes: generating a random, non-trivial and semantically valid Kotlin program is hard. In this paper, we talk about type-centriccompilerfuzzing in the form of type-centricenumeration, an approach inspired by skeletal program enumeration [1] and based on a combination of generative and mutation-based fuzzing, which solves this problem by focusing on program types. After creating the skeleton program, we fill the typed holes with fragments of suitable type, created via generation and enhanced by semantic-aware mutation. We implemented this approach in our Kotlin compiler fuzzing framework called Backend Bug Finder (BBF) and did an extensive evaluation, not only testing the real-world feasibility of our approach, but also comparing it to other compiler fuzzing techniques. The results show our approach to be significantly better compared to other fuzzing approaches at generating semantically valid Kotlin programs, while creating more interesting crash-inducing inputs at the same time. We managed to find more than 50 previously unknown compiler crashes, of which 18 were considered important after their triage by the compiler team.
2022-05-03
Mu, Yanzhou, Wang, Zan, Liu, Shuang, Sun, Jun, Chen, Junjie, Chen, Xiang.  2021.  HARS: Heuristic-Enhanced Adaptive Randomized Scheduling for Concurrency Testing. 2021 IEEE 21st International Conference on Software Quality, Reliability and Security (QRS). :219—230.

Concurrency programs often induce buggy results due to the unexpected interaction among threads. The detection of these concurrency bugs costs a lot because they usually appear under a specific execution trace. How to virtually explore different thread schedules to detect concurrency bugs efficiently is an important research topic. Many techniques have been proposed, including lightweight techniques like adaptive randomized scheduling (ARS) and heavyweight techniques like maximal causality reduction (MCR). Compared to heavyweight techniques, ARS is efficient in exploring different schedulings and achieves state-of-the-art performance. However, it will lead to explore large numbers of redundant thread schedulings, which will reduce the efficiency. Moreover, it suffers from the “cold start” issue, when little information is available to guide the distance calculation at the beginning of the exploration. In this work, we propose a Heuristic-Enhanced Adaptive Randomized Scheduling (HARS) algorithm, which improves ARS to detect concurrency bugs guided with novel distance metrics and heuristics obtained from existing research findings. Compared with the adaptive randomized scheduling method, it can more effectively distinguish the traces that may contain concurrency bugs and avoid redundant schedules, thus exploring diverse thread schedules effectively. We conduct an evaluation on 45 concurrency Java programs. The evaluation results show that our algorithm performs more stably in terms of effectiveness and efficiency in detecting concurrency bugs. Notably, HARS detects hard-to-expose bugs more effectively, where the buggy traces are rare or the bug triggering conditions are tricky.

2022-04-18
Paul, Rajshakhar, Turzo, Asif Kamal, Bosu, Amiangshu.  2021.  Why Security Defects Go Unnoticed During Code Reviews? A Case-Control Study of the Chromium OS Project 2021 IEEE/ACM 43rd International Conference on Software Engineering (ICSE). :1373–1385.
Peer code review has been found to be effective in identifying security vulnerabilities. However, despite practicing mandatory code reviews, many Open Source Software (OSS) projects still encounter a large number of post-release security vulnerabilities, as some security defects escape those. Therefore, a project manager may wonder if there was any weakness or inconsistency during a code review that missed a security vulnerability. Answers to this question may help a manager pinpointing areas of concern and taking measures to improve the effectiveness of his/her project's code reviews in identifying security defects. Therefore, this study aims to identify the factors that differentiate code reviews that successfully identified security defects from those that missed such defects. With this goal, we conduct a case-control study of Chromium OS project. Using multi-stage semi-automated approaches, we build a dataset of 516 code reviews that successfully identified security defects and 374 code reviews where security defects escaped. The results of our empirical study suggest that the are significant differences between the categories of security defects that are identified and that are missed during code reviews. A logistic regression model fitted on our dataset achieved an AUC score of 0.91 and has identified nine code review attributes that influence identifications of security defects. While time to complete a review, the number of mutual reviews between two developers, and if the review is for a bug fix have positive impacts on vulnerability identification, opposite effects are observed from the number of directories under review, the number of total reviews by a developer, and the total number of prior commits for the file under review.
2022-04-12
Redini, Nilo, Continella, Andrea, Das, Dipanjan, De Pasquale, Giulio, Spahn, Noah, Machiry, Aravind, Bianchi, Antonio, Kruegel, Christopher, Vigna, Giovanni.  2021.  Diane: Identifying Fuzzing Triggers in Apps to Generate Under-constrained Inputs for IoT Devices. 2021 IEEE Symposium on Security and Privacy (SP). :484—500.
Internet of Things (IoT) devices have rooted themselves in the everyday life of billions of people. Thus, researchers have applied automated bug finding techniques to improve their overall security. However, due to the difficulties in extracting and emulating custom firmware, black-box fuzzing is often the only viable analysis option. Unfortunately, this solution mostly produces invalid inputs, which are quickly discarded by the targeted IoT device and do not penetrate its code. Another proposed approach is to leverage the companion app (i.e., the mobile app typically used to control an IoT device) to generate well-structured fuzzing inputs. Unfortunately, the existing solutions produce fuzzing inputs that are constrained by app-side validation code, thus significantly limiting the range of discovered vulnerabilities.In this paper, we propose a novel approach that overcomes these limitations. Our key observation is that there exist functions inside the companion app that can be used to generate optimal (i.e., valid yet under-constrained) fuzzing inputs. Such functions, which we call fuzzing triggers, are executed before any data-transforming functions (e.g., network serialization), but after the input validation code. Consequently, they generate inputs that are not constrained by app-side sanitization code, and, at the same time, are not discarded by the analyzed IoT device due to their invalid format. We design and develop Diane, a tool that combines static and dynamic analysis to find fuzzing triggers in Android companion apps, and then uses them to fuzz IoT devices automatically. We use Diane to analyze 11 popular IoT devices, and identify 11 bugs, 9 of which are zero days. Our results also show that without using fuzzing triggers, it is not possible to generate bug-triggering inputs for many devices.
2022-03-14
Mambretti, Andrea, Sandulescu, Alexandra, Sorniotti, Alessandro, Robertson, William, Kirda, Engin, Kurmus, Anil.  2021.  Bypassing memory safety mechanisms through speculative control flow hijacks. 2021 IEEE European Symposium on Security and Privacy (EuroS P). :633–649.
The prevalence of memory corruption bugs in the past decades resulted in numerous defenses, such as stack canaries, control flow integrity (CFI), and memory-safe languages. These defenses can prevent entire classes of vulnerabilities, and help increase the security posture of a program. In this paper, we show that memory corruption defenses can be bypassed using speculative execution attacks. We study the cases of stack protectors, CFI, and bounds checks in Go, demonstrating under which conditions they can be bypassed by a form of speculative control flow hijack, relying on speculative or architectural overwrites of control flow data. Information is leaked by redirecting the speculative control flow of the victim to a gadget accessing secret data and acting as a side channel send. We also demonstrate, for the first time, that this can be achieved by stitching together multiple gadgets, in a speculative return-oriented programming attack. We discuss and implement software mitigations, showing moderate performance impact.
McQuistin, Stephen, Band, Vivian, Jacob, Dejice, Perkins, Colin.  2021.  Investigating Automatic Code Generation for Network Packet Parsing. 2021 IFIP Networking Conference (IFIP Networking). :1—9.
Use of formal protocol description techniques and code generation can reduce bugs in network packet parsing code. However, such techniques are themselves complex, and don't see wide adoption in the protocol standards development community, where the focus is on consensus building and human-readable specifications. We explore the utility and effectiveness of new techniques for describing protocol data, specifically designed to integrate with the standards development process, and discuss how they can be used to generate code that is safer and more trustworthy, while maintaining correctness and performance.
Tempel, Sören, Herdt, Vladimir, Drechsler, Rolf.  2021.  Towards Reliable Spatial Memory Safety for Embedded Software by Combining Checked C with Concolic Testing. 2021 58th ACM/IEEE Design Automation Conference (DAC). :667—672.
In this paper we propose to combine the safe C dialect Checked C with concolic testing to obtain an effective methodology for attaining safer C code. Checked C is a modern and backward compatible extension to the C programming language which provides facilities for writing memory-safe C code. We utilize incremental conversions of unsafe C software to Checked C. After each increment, we leverage concolic testing, an effective test generation technique, to support the conversion process by searching for newly introduced and existing bugs.Our RISC-V experiments using the RIOT Operating System (OS) demonstrate the effectiveness of our approach. We uncovered 4 previously unknown bugs and 3 bugs accidentally introduced through our conversion process.
2022-02-24
Hess, Andreas V., Mödersheim, Sebastian, Brucker, Achim D., Schlichtkrull, Anders.  2021.  Performing Security Proofs of Stateful Protocols. 2021 IEEE 34th Computer Security Foundations Symposium (CSF). :1–16.
In protocol verification we observe a wide spectrum from fully automated methods to interactive theorem proving with proof assistants like Isabelle/HOL. The latter provide overwhelmingly high assurance of the correctness, which automated methods often cannot: due to their complexity, bugs in such automated verification tools are likely and thus the risk of erroneously verifying a flawed protocol is non-negligible. There are a few works that try to combine advantages from both ends of the spectrum: a high degree of automation and assurance. We present here a first step towards achieving this for a more challenging class of protocols, namely those that work with a mutable long-term state. To our knowledge this is the first approach that achieves fully automated verification of stateful protocols in an LCF-style theorem prover. The approach also includes a simple user-friendly transaction-based protocol specification language embedded into Isabelle, and can also leverage a number of existing results such as soundness of a typed model