Visible to the public Biblio

Filters: Keyword is taint analysis  [Clear All Filters]
2018-05-02
Chen, Jia, Feng, Yu, Dillig, Isil.  2017.  Precise Detection of Side-Channel Vulnerabilities Using Quantitative Cartesian Hoare Logic. Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security. :875–890.
This paper presents Themis, an end-to-end static analysis tool for finding resource-usage side-channel vulnerabilities in Java applications. We introduce the notion of epsilon-bounded non-interference, a variant and relaxation of Goguen and Meseguer's well-known non-interference principle. We then present Quantitative Cartesian Hoare Logic (QCHL), a program logic for verifying epsilon-bounded non-interference. Our tool, Themis, combines automated reasoning in CHL with lightweight static taint analysis to improve scalability. We evaluate Themis on well known Java applications and demonstrate that Themis can find unknown side-channel vulnerabilities in widely-used programs. We also show that Themis can verify the absence of vulnerabilities in repaired versions of vulnerable programs and that Themis compares favorably against Blazer, a state-of-the-art static analysis tool for finding timing side channels in Java applications.
Do, Lisa Nguyen Quang, Ali, Karim, Livshits, Benjamin, Bodden, Eric, Smith, Justin, Murphy-Hill, Emerson.  2017.  Just-in-time Static Analysis. Proceedings of the 26th ACM SIGSOFT International Symposium on Software Testing and Analysis. :307–317.
We present the concept of Just-In-Time (JIT) static analysis that interleaves code development and bug fixing in an integrated development environment. Unlike traditional batch-style analysis tools, a JIT analysis tool presents warnings to code developers over time, providing the most relevant results quickly, and computing less relevant results incrementally later. In this paper, we describe general guidelines for designing JIT analyses. We also present a general recipe for transforming static data-flow analyses to JIT analyses through a concept of layered analysis execution. We illustrate this transformation through CHEETAH, a JIT taint analysis for Android applications. Our empirical evaluation of CHEETAH on real-world applications shows that our approach returns warnings quickly enough to avoid disrupting the normal workflow of developers. This result is confirmed by our user study, in which developers fixed data leaks twice as fast when using CHEETAH compared to an equivalent batch-style analysis.
Yadegari, Babak, Stephens, Jon, Debray, Saumya.  2017.  Analysis of Exception-Based Control Transfers. Proceedings of the Seventh ACM on Conference on Data and Application Security and Privacy. :205–216.
Dynamic taint analysis and symbolic execution find many important applications in security-related program analyses. However, current techniques for such analyses do not take proper account of control transfers due to exceptions. As a result, they can fail to account for implicit flows arising from exception-based control transfers, leading to loss of precision and potential false negatives in analysis results. While the idea of using exceptions for obfuscating (unconditional) control transfers is well known, we are not aware of any prior work discussing the use of exceptions to implement conditional control transfers and implicit information flows. This paper demonstrates the problems that can arise in existing dynamic taint analysis and symbolic execution systems due to exception-based implicit information flows and proposes a generic architecture-agnostic solution for reasoning about the behavior of code using user-defined exception handlers. Experimental results from a prototype implementation indicate that the ideas described produce better results than current state-of-the-art systems.
Korczynski, David, Yin, Heng.  2017.  Capturing Malware Propagations with Code Injections and Code-Reuse Attacks. Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security. :1691–1708.
Defending against malware involves analysing large amounts of suspicious samples. To deal with such quantities we rely heavily on automatic approaches to determine whether a sample is malicious or not. Unfortunately, complete and precise automatic analysis of malware is far from an easy task. This is because malware is often designed to contain several techniques and countermeasures specifically to hinder analysis. One of these techniques is for the malware to propagate through the operating system so as to execute in the context of benign processes. The malware does this by writing memory to a given process and then proceeds to have this memory execute. In some cases these propagations are trivial to capture because they rely on well-known techniques. However, in the cases where malware deploys novel code injection techniques, rely on code-reuse attacks and potentially deploy dynamically generated code, the problem of capturing a complete and precise view of the malware execution is non-trivial. In this paper we present a unified approach to tracing malware propagations inside the host in the context of code injections and code-reuse attacks. We also present, to the knowledge of the authors, the first approach to identifying dynamically generated code based on information-flow analysis. We implement our techniques in a system called Tartarus and match Tartarus with both synthetic applications and real-world malware. We compare Tartarus to previous works and show that our techniques substantially improve the precision for collecting malware execution traces, and that our approach can capture intrinsic characteristics of novel code injection techniques.
Youssef, Ayman, Shosha, Ahmed F..  2017.  Quantitave Dynamic Taint Analysis of Privacy Leakage in Android Arabic Apps. Proceedings of the 12th International Conference on Availability, Reliability and Security. :58:1–58:9.
Android smartphones are ubiquitous all over the world, and organizations that turn profits out of data mining user personal information are on the rise. Many users are not aware of the risks of accepting permissions from Android apps, and the continued state of insecurity, manifested in increased level of breaches across all large organizations means that personal information is falling in the hands of malicious actors. This paper aims at shedding the light on privacy leakage in apps that target a specific demography, Arabs. The research takes into consideration apps that cater to specific cultural aspects of this region and identify how they could be abusing the trust given to them by unsuspecting users. Dynamic taint analysis is used in a virtualized environment to analyze top free apps based on popularity in Google Play store. Information presented highlights how different categories of apps leak different categories of private information.
2018-02-15
Saoji, Tejas, Austin, Thomas H., Flanagan, Cormac.  2017.  Using Precise Taint Tracking for Auto-sanitization. Proceedings of the 2017 Workshop on Programming Languages and Analysis for Security. :15–24.

Taint analysis has been used in numerous scripting languages such as Perl and Ruby to defend against various form of code injection attacks, such as cross-site scripting (XSS) and SQL-injection. However, most taint analysis systems simply fail when tainted information is used in a possibly unsafe manner. In this paper, we explore how precise taint tracking can be used in order to secure web content. Rather than simply crashing, we propose that a library-writer defined sanitization function can instead be used on the tainted portions of a string. With this approach, library writers or framework developers can design their tools to be resilient, even if inexperienced developers misuse these libraries in unsafe ways. In other words, developer mistakes do not have to result in system crashes to guarantee security. We implement both coarse-grained and precise taint tracking in JavaScript, and show how our precise taint tracking API can be used to defend against SQL injection and XSS attacks. We further evaluate the performance of this approach, showing that precise taint tracking involves an overhead of approximately 22%.

2018-01-23
van der Veen, Victor, Andriesse, Dennis, Stamatogiannakis, Manolis, Chen, Xi, Bos, Herbert, Giuffrdia, Cristiano.  2017.  The Dynamics of Innocent Flesh on the Bone: Code Reuse Ten Years Later. Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security. :1675–1689.

In 2007, Shacham published a seminal paper on Return-Oriented Programming (ROP), the first systematic formulation of code reuse. The paper has been highly influential, profoundly shaping the way we still think about code reuse today: an attacker analyzes the "geometry" of victim binary code to locate gadgets and chains these to craft an exploit. This model has spurred much research, with a rapid progression of increasingly sophisticated code reuse attacks and defenses over time. After ten years, the common perception is that state-of-the-art code reuse defenses are effective in significantly raising the bar and making attacks exceedingly hard. In this paper, we challenge this perception and show that an attacker going beyond "geometry" (static analysis) and considering the "dynamics" (dynamic analysis) of a victim program can easily find function call gadgets even in the presence of state-of-the-art code-reuse defenses. To support our claims, we present Newton, a run-time gadget-discovery framework based on constraint-driven dynamic taint analysis. Newton can model a broad range of defenses by mapping their properties into simple, stackable, reusable constraints, and automatically generate gadgets that comply with these constraints. Using Newton, we systematically map and compare state-of-the-art defenses, demonstrating that even simple interactions with popular server programs are adequate for finding gadgets for all state-of-the-art code-reuse defenses. We conclude with an nginx case study, which shows that a Newton-enabled attacker can craft attacks which comply with the restrictions of advanced defenses, such as CPI and context-sensitive CFI.

2017-05-30
Sun, Pengfei, Han, Rui, Zhang, Mingbo, Zonouz, Saman.  2016.  Trace-free Memory Data Structure Forensics via Past Inference and Future Speculations. Proceedings of the 32Nd Annual Conference on Computer Security Applications. :570–582.

A yet-to-be-solved but very vital problem in forensics analysis is accurate memory dump data type reverse engineering where the target process is not a priori specified and could be any of the running processes within the system. We present ReViver, a lightweight system-wide solution that extracts data type information from the memory dump without its past execution traces. ReViver constructs the dump's accurate data structure layout through collection of statistical information about possible past traces, forensics inspection of the present memory dump, and speculative investigation of potential future executions of the suspended process. First, ReViver analyzes a heavily instrumented set of execution paths of the same executable that end in the same state of the memory dump (the eip and call stack), and collects statistical information the potential data structure instances on the captured dump. Second, ReViver uses the statistical information and performs a word-byword data type forensics inspection of the captured memory dump. Finally, ReViver revives the dump's execution and explores its potential future execution paths symbolically. ReViver traces the executions including library/system calls for their known argument/return data types, and performs backward taint analysis to mark the dump bytes with relevant data type information. ReViver's experimental results on real-world applications are very promising (98.1%), and show that ReViver improves the accuracy of the past trace-free memory forensics solutions significantly while maintaining a negligible runtime performance overhead (1.8%).

Srinivasan, Venkatesh, Reps, Thomas.  2016.  An Improved Algorithm for Slicing Machine Code. Proceedings of the 2016 ACM SIGPLAN International Conference on Object-Oriented Programming, Systems, Languages, and Applications. :378–393.

Machine-code slicing is an important primitive for building binary analysis and rewriting tools, such as taint trackers, fault localizers, and partial evaluators. However, it is not easy to create a machine-code slicer that exhibits a high level of precision. Moreover, the problem of creating such a tool is compounded by the fact that a small amount of local imprecision can be amplified via cascade effects. Most instructions in instruction sets such as Intel's IA-32 and ARM are multi-assignments: they have several inputs and several outputs (registers, flags, and memory locations). This aspect of the instruction set introduces a granularity issue during slicing: there are often instructions at which we would like the slice to include only a subset of the instruction's semantics, whereas the slice is forced to include the entire instruction. Consequently, the slice computed by state-of-the-art tools is very imprecise, often including essentially the entire program. This paper presents an algorithm to slice machine code more accurately. To counter the granularity issue, our algorithm performs slicing at the microcode level, instead of the instruction level, and obtains a more precise microcode slice. To reconstitute a machine-code program from a microcode slice, our algorithm uses machine-code synthesis. Our experiments on IA-32 binaries of FreeBSD utilities show that, in comparison to slices computed by a state-of-the-art tool, our algorithm reduces the size of backward slices by 33%, and forward slices by 70%.

Etigowni, Sriharsha, Tian, Dave(Jing), Hernandez, Grant, Zonouz, Saman, Butler, Kevin.  2016.  CPAC: Securing Critical Infrastructure with Cyber-physical Access Control. Proceedings of the 32Nd Annual Conference on Computer Security Applications. :139–152.

Critical infrastructure such as the power grid has become increasingly complex. The addition of computing elements to traditional physical components increases complexity and hampers insight into how elements in the system interact with each other. The result is an infrastructure where operational mistakes, some of which cannot be distinguished from attacks, are more difficult to prevent and have greater potential impact, such as leaking sensitive information to the operator or attacker. In this paper, we present CPAC, a cyber-physical access control solution to manage complexity and mitigate threats in cyber-physical environments, with a focus on the electrical smart grid. CPAC uses information flow analysis based on mathematical models of the physical grid to generate policies enforced through verifiable logic. At the device side, CPAC combines symbolic execution with lightweight dynamic execution monitoring to allow non-intrusive taint analysis on programmable logic controllers in realtime. These components work together to provide a realtime view of all system elements, and allow for more robust and finer-grained protections than any previous solution to securing the grid. We implement a prototype of CPAC using Bachmann PLCs and evaluate several real-world incidents that demonstrate its scalability and effectiveness. The policy checking for a nation-wide grid is less than 150 ms, faster than existing solutions. We additionally show that CPAC can analyze potential component failures for arbitrary component failures, far beyond the capabilities of currently deployed systems. CPAC thus provides a solution to secure the modern smart grid from operator mistakes or insider attacks, maintain operational privacy, and support N - x contingencies.

Höschele, Matthias, Zeller, Andreas.  2016.  Mining Input Grammars from Dynamic Taints. Proceedings of the 31st IEEE/ACM International Conference on Automated Software Engineering. :720–725.

Knowing which part of a program processes which parts of an input can reveal the structure of the input as well as the structure of the program. In a URL textlesspretextgreaterhttp://www.example.com/path/textless/pretextgreater, for instance, the protocol textlesspretextgreaterhttptextless/pretextgreater, the host textlesspretextgreaterwww.example.comtextless/pretextgreater, and the path textlesspretextgreaterpathtextless/pretextgreater would be handled by different functions and stored in different variables. Given a set of sample inputs, we use dynamic tainting to trace the data flow of each input character, and aggregate those input fragments that would be handled by the same function into lexical and syntactical entities. The result is a context-free grammar that reflects valid input structure. In its evaluation, our AUTOGRAM prototype automatically produced readable and structurally accurate grammars for inputs like URLs, spreadsheets or configuration files. The resulting grammars not only allow simple reverse engineering of input formats, but can also directly serve as input for test generators.

Zhai, Juan, Huang, Jianjun, Ma, Shiqing, Zhang, Xiangyu, Tan, Lin, Zhao, Jianhua, Qin, Feng.  2016.  Automatic Model Generation from Documentation for Java API Functions. Proceedings of the 38th International Conference on Software Engineering. :380–391.

Modern software systems are becoming increasingly complex, relying on a lot of third-party library support. Library behaviors are hence an integral part of software behaviors. Analyzing them is as important as analyzing the software itself. However, analyzing libraries is highly challenging due to the lack of source code, implementation in different languages, and complex optimizations. We observe that many Java library functions provide excellent documentation, which concisely describes the functionalities of the functions. We develop a novel technique that can construct models for Java API functions by analyzing the documentation. These models are simpler implementations in Java compared to the original ones and hence easier to analyze. More importantly, they provide the same functionalities as the original functions. Our technique successfully models 326 functions from 14 widely used Java classes. We also use these models in static taint analysis on Android apps and dynamic slicing for Java programs, demonstrating the effectiveness and efficiency of our models.

Wang, Kai, Zhang, Yuqing, Liu, Peng.  2016.  Call Me Back!: Attacks on System Server and System Apps in Android Through Synchronous Callback. Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security. :92–103.

Android is the most commonly used mobile device operation system. The core of Android, the System Server (SS), is a multi-threaded process that provides most of the system services. Based on a new understanding of the security risks introduced by the callback mechanism in system services, we have discovered a general type of design flaw. A vulnerability detection tool has been designed and implemented based on static taint analysis. We applied the tool on all the 80 system services in the SS of Android 5.1.0. With its help, we have discovered six previously unknown vulnerabilities, which are further confirmed on Android 2.3.7-6.0.1. According to our analysis, about 97.3% of the entire 1.4 billion real-world Android devices are vulnerable. Our proof-of-concept attack proves that the vulnerabilities can enable a malicious app to freeze critical system functionalities or soft-reboot the system immediately. It is a neat type of denial-of-service at-tack. We also proved that the attacks can be conducted at mission critical moments to achieve meaningful goals, such as anti anti-virus, anti process-killer, hindering app updates or system patching. After being informed, Google confirmed our findings promptly. Several suggestions on how to use callbacks safely are also proposed to Google.

Amir-Mohammadian, Sepehr, Skalka, Christian.  2016.  In-Depth Enforcement of Dynamic Integrity Taint Analysis. Proceedings of the 2016 ACM Workshop on Programming Languages and Analysis for Security. :43–56.

Dynamic taint analysis can be used as a defense against low-integrity data in applications with untrusted user interfaces. An important example is defense against XSS and injection attacks in programs with web interfaces. Data sanitization is commonly used in this context, and can be treated as a precondition for endorsement in a dynamic integrity taint analysis. However, sanitization is often incomplete in practice. We develop a model of dynamic integrity taint analysis for Java that addresses imperfect sanitization with an in-depth approach. To avoid false positives, results of sanitization are endorsed for access control (aka prospective security), but are tracked and logged for auditing and accountability (aka retrospective security). We show how this heterogeneous prospective/retrospective mechanism can be specified as a uniform policy, separate from code. We then use this policy to establish correctness conditions for a program rewriting algorithm that instruments code for the analysis. The rewriting itself is a model of existing, efficient Java taint analysis tools.

Gao, Fengjuan, Chen, Tianjiao, Wang, Yu, Situ, Lingyun, Wang, Linzhang, Li, Xuandong.  2016.  Carraybound: Static Array Bounds Checking in C Programs Based on Taint Analysis. Proceedings of the 8th Asia-Pacific Symposium on Internetware. :81–90.

C programming language never performs automatic bounds checking in order to speed up execution. But bounds checking is absolutely necessary in any program. Because if a variable is out-of-bounds, some serious errors may occur during execution, such as endless loop or buffer overflows. When there are arrays used in a program, the index of an array must be within the boundary of the array. But programmers always miss the array bounds checking or do not perform a correct array bounds checking. In this paper, we perform static analysis based on taint analysis and data flow analysis to detect which arrays do not have correct array bounds checking in the program. And we implement an automatic static tool, Carraybound. And the experimental results show that Carraybound can work effectively and efficiently.

Ming, Jiang, Wu, Dinghao, Wang, Jun, Xiao, Gaoyao, Liu, Peng.  2016.  StraightTaint: Decoupled Offline Symbolic Taint Analysis. Proceedings of the 31st IEEE/ACM International Conference on Automated Software Engineering. :308–319.

Taint analysis has been widely applied in ex post facto security applications, such as attack provenance investigation, computer forensic analysis, and reverse engineering. Unfortunately, the high runtime overhead imposed by dynamic taint analysis makes it impractical in many scenarios. The key obstacle is the strict coupling of program execution and taint tracking logic code. To alleviate this performance bottleneck, recent work seeks to offload taint analysis from program execution and run it on a spare core or a different CPU. However, since the taint analysis has heavy data and control dependencies on the program execution, the massive data in recording and transformation overshadow the benefit of decoupling. In this paper, we propose a novel technique to allow very lightweight logging, resulting in much lower execution slowdown, while still permitting us to perform full-featured offline taint analysis. We develop StraightTaint, a hybrid taint analysis tool that completely decouples the program execution and taint analysis. StraightTaint relies on very lightweight logging of the execution information to reconstruct a straight-line code, enabling an offline symbolic taint analysis without frequent data communication with the application. While StraightTaint does not log complete runtime or input values, it is able to precisely identify the causal relationships between sources and sinks, for example. Compared with traditional dynamic taint analysis tools, StraightTaint has much lower application runtime overhead.

2017-05-17
Kwon, Yonghwi, Kim, Dohyeong, Sumner, William Nick, Kim, Kyungtae, Saltaformaggio, Brendan, Zhang, Xiangyu, Xu, Dongyan.  2016.  LDX: Causality Inference by Lightweight Dual Execution. Proceedings of the Twenty-First International Conference on Architectural Support for Programming Languages and Operating Systems. :503–515.

Causality inference, such as dynamic taint anslysis, has many applications (e.g., information leak detection). It determines whether an event e is causally dependent on a preceding event c during execution. We develop a new causality inference engine LDX. Given an execution, it spawns a slave execution, in which it mutates c and observes whether any change is induced at e. To preclude non-determinism, LDX couples the executions by sharing syscall outcomes. To handle path differences induced by the perturbation, we develop a novel on-the-fly execution alignment scheme that maintains a counter to reflect the progress of execution. The scheme relies on program analysis and compiler transformation. LDX can effectively detect information leak and security attacks with an average overhead of 6.08% while running the master and the slave concurrently on separate CPUs, much lower than existing systems that require instruction level monitoring. Furthermore, it has much better accuracy in causality inference.

2015-05-05
Gupta, M.K., Govil, M.C., Singh, G..  2014.  A context-sensitive approach for precise detection of cross-site scripting vulnerabilities. Innovations in Information Technology (INNOVATIONS), 2014 10th International Conference on. :7-12.

Currently, dependence on web applications is increasing rapidly for social communication, health services, financial transactions and many other purposes. Unfortunately, the presence of cross-site scripting vulnerabilities in these applications allows malicious user to steals sensitive information, install malware, and performs various malicious operations. Researchers proposed various approaches and developed tools to detect XSS vulnerability from source code of web applications. However, existing approaches and tools are not free from false positive and false negative results. In this paper, we propose a taint analysis and defensive programming based HTML context-sensitive approach for precise detection of XSS vulnerability from source code of PHP web applications. It also provides automatic suggestions to improve the vulnerable source code. Preliminary experiments and results on test subjects show that proposed approach is more efficient than existing ones.

2015-05-04
Shao Shuai, Dong Guowei, Guo Tao, Yang Tianchang, Shi Chenjie.  2014.  Analysis on Password Protection in Android Applications. P2P, Parallel, Grid, Cloud and Internet Computing (3PGCIC), 2014 Ninth International Conference on. :504-507.

Although there has been much research on the leakage of sensitive data in Android applications, most of the existing research focus on how to detect the malware or adware that are intentionally collecting user privacy. There are not much research on analyzing the vulnerabilities of apps that may cause the leakage of privacy. In this paper, we present a vulnerability analyzing method which combines taint analysis and cryptography misuse detection. The four steps of this method are decompile, taint analysis, API call record, cryptography misuse analysis, all of which steps except taint analysis can be executed by the existing tools. We develop a prototype tool PW Exam to analysis how the passwords are handled and if the app is vulnerable to password leakage. Our experiment shows that a third of apps are vulnerable to leak the users' passwords.

2014-09-26
Schwartz, E.J., Avgerinos, T., Brumley, D..  2010.  All You Ever Wanted to Know about Dynamic Taint Analysis and Forward Symbolic Execution (but Might Have Been Afraid to Ask). Security and Privacy (SP), 2010 IEEE Symposium on. :317-331.

Dynamic taint analysis and forward symbolic execution are quickly becoming staple techniques in security analyses. Example applications of dynamic taint analysis and forward symbolic execution include malware analysis, input filter generation, test case generation, and vulnerability discovery. Despite the widespread usage of these two techniques, there has been little effort to formally define the algorithms and summarize the critical issues that arise when these techniques are used in typical security contexts. The contributions of this paper are two-fold. First, we precisely describe the algorithms for dynamic taint analysis and forward symbolic execution as extensions to the run-time semantics of a general language. Second, we highlight important implementation choices, common pitfalls, and considerations when using these techniques in a security context.