Biblio | CPS-VO

Sikder, Md Nazmul Kabir, Batarseh, Feras A., Wang, Pei, Gorentala, Nitish. 2022. Model-Agnostic Scoring Methods for Artificial Intelligence Assurance. 2022 IEEE 29th Annual Software Technology Conference (STC). :9–18.

State of the art Artificial Intelligence Assurance (AIA) methods validate AI systems based on predefined goals and standards, are applied within a given domain, and are designed for a specific AI algorithm. Existing works do not provide information on assuring subjective AI goals such as fairness and trustworthiness. Other assurance goals are frequently required in an intelligent deployment, including explainability, safety, and security. Accordingly, issues such as value loading, generalization, context, and scalability arise; however, achieving multiple assurance goals without major trade-offs is generally deemed an unattainable task. In this manuscript, we present two AIA pipelines that are model-agnostic, independent of the domain (such as: healthcare, energy, banking), and provide scores for AIA goals including explainability, safety, and security. The two pipelines: Adversarial Logging Scoring Pipeline (ALSP) and Requirements Feedback Scoring Pipeline (RFSP) are scalable and tested with multiple use cases, such as a water distribution network and a telecommunications network, to illustrate their benefits. ALSP optimizes models using a game theory approach and it also logs and scores the actions of an AI model to detect adversarial inputs, and assures the datasets used for training. RFSP identifies the best hyper-parameters using a Bayesian approach and provides assurance scores for subjective goals such as ethical AI using user inputs and statistical assurance measures. Each pipeline has three algorithms that enforce the final assurance scores and other outcomes. Unlike ALSP (which is a parallel process), RFSP is user-driven and its actions are sequential. Data are collected for experimentation; the results of both pipelines are presented and contrasted.

Wang, Pei, Bangert, Julian, Kern, Christoph. 2021. If It’s Not Secure, It Should Not Compile: Preventing DOM-Based XSS in Large-Scale Web Development with API Hardening. 2021 IEEE/ACM 43rd International Conference on Software Engineering (ICSE). :1360–1372.

With tons of efforts spent on its mitigation, Cross-site scripting (XSS) remains one of the most prevalent security threats on the internet. Decades of exploitation and remediation demonstrated that code inspection and testing alone does not eliminate XSS vulnerabilities in complex web applications with a high degree of confidence. This paper introduces Google's secure-by-design engineering paradigm that effectively prevents DOM-based XSS vulnerabilities in large-scale web development. Our approach, named API hardening, enforces a series of company-wide secure coding practices. We provide a set of secure APIs to replace native DOM APIs that are prone to XSS vulnerabilities. Through a combination of type contracts and appropriate validation and escaping, the secure APIs ensure that applications based thereon are free of XSS vulnerabilities. We deploy a simple yet capable compile-time checker to guarantee that developers exclusively use our hardened APIs to interact with the DOM. We make various of efforts to scale this approach to tens of thousands of engineers without significant productivity impact. By offering rigorous tooling and consultant support, we help developers adopt the secure coding practices as seamlessly as possible. We present empirical results showing how API hardening has helped reduce the occurrences of XSS vulnerabilities in Google's enormous code base over the course of two-year deployment.

Wang, Pei, Bangert, Julian, Kern, Christoph. 2021. If It's Not Secure, It Should Not Compile: Preventing DOM-Based XSS in Large-Scale Web Development with API Hardening. 2021 IEEE/ACM 43rd International Conference on Software Engineering (ICSE). :1360–1372.

With tons of efforts spent on its mitigation, Cross-site scripting (XSS) remains one of the most prevalent security threats on the internet. Decades of exploitation and remediation demonstrated that code inspection and testing alone does not eliminate XSS vulnerabilities in complex web applications with a high degree of confidence. This paper introduces Google's secure-by-design engineering paradigm that effectively prevents DOM-based XSS vulnerabilities in large-scale web development. Our approach, named API hardening, enforces a series of company-wide secure coding practices. We provide a set of secure APIs to replace native DOM APIs that are prone to XSS vulnerabilities. Through a combination of type contracts and appropriate validation and escaping, the secure APIs ensure that applications based thereon are free of XSS vulnerabilities. We deploy a simple yet capable compile-time checker to guarantee that developers exclusively use our hardened APIs to interact with the DOM. We make various of efforts to scale this approach to tens of thousands of engineers without significant productivity impact. By offering rigorous tooling and consultant support, we help developers adopt the secure coding practices as seamlessly as possible. We present empirical results showing how API hardening has helped reduce the occurrences of XSS vulnerabilities in Google's enormous code base over the course of two-year deployment.

Wang, Pei, Guðmundsson, Bjarki Ágúst, Kotowicz, Krzysztof. 2021. Adopting Trusted Types in ProductionWeb Frameworks to Prevent DOM-Based Cross-Site Scripting: A Case Study. 2021 IEEE European Symposium on Security and Privacy Workshops (EuroS PW). :60–73.

Cross-site scripting (XSS) is a common security vulnerability found in web applications. DOM-based XSS, one of the variants, is becoming particularly more prevalent with the boom of single-page applications where most of the UI changes are achieved by modifying the DOM through in-browser scripting. It is very easy for developers to introduce XSS vulnerabilities into web applications since there are many ways for user-controlled, unsanitized input to flow into a Web API and get interpreted as HTML markup and JavaScript code. An emerging Web API proposal called Trusted Types aims to prevent DOM XSS by making Web APIs secure by default. Different from other XSS mitigations that mostly focus on post-development protection, Trusted Types direct developers to write XSS-free code in the first place. A common concern when adopting a new security mechanism is how much effort is required to refactor existing code bases. In this paper, we report a case study on adopting Trusted Types in a well-established web framework. Our experience can help the web community better understand the benefits of making web applications compatible with Trusted Types, while also getting to know the related challenges and resolutions. We focused our work on Angular, which is one of the most popular web development frameworks available on the market.

Wang, Shuai, Wang, Wenhao, Bao, Qinkun, Wang, Pei, Wang, XiaoFeng, Wu, Dinghao. 2017. Binary Code Retrofitting and Hardening Using SGX. Proceedings of the 2017 Workshop on Forming an Ecosystem Around Software Transformation. :43–49.

Trusted Execution Environment (TEE) is designed to deliver a safe execution environment for software systems. Intel Software Guard Extensions (SGX) provides isolated memory regions (i.e., SGX enclaves) to protect code and data from adversaries in the untrusted world. While existing research has proposed techniques to execute entire executable files inside enclave instances by providing rich sets of OS facilities, one notable limitation of these techniques is the unavoidably large size of Trusted Computing Base (TCB), which can potentially break the principle of least privilege. In this work, we describe techniques that provide practical and efficient protection of security sensitive code components in legacy binary code. Our technique dissects input binaries into multiple components which are further built into SGX enclave instances. We also leverage deliberately-designed binary editing techniques to retrofit the input binary code and preserve the original program semantics. Our tentative evaluations on hardening AES encryption and decryption procedures demonstrate the practicability and efficiency of the proposed technique.

Xu, Jun, Mu, Dongliang, Chen, Ping, Xing, Xinyu, Wang, Pei, Liu, Peng. 2016. CREDAL: Towards Locating a Memory Corruption Vulnerability with Your Core Dump. Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security. :529–540.

After a program has crashed and terminated abnormally, it typically leaves behind a snapshot of its crashing state in the form of a core dump. While a core dump carries a large amount of information, which has long been used for software debugging, it barely serves as informative debugging aids in locating software faults, particularly memory corruption vulnerabilities. A memory corruption vulnerability is a special type of software faults that an attacker can exploit to manipulate the content at a certain memory. As such, a core dump may contain a certain amount of corrupted data, which increases the difficulty in identifying useful debugging information (e.g. , a crash point and stack traces). Without a proper mechanism to deal with this problem, a core dump can be practically useless for software failure diagnosis. In this work, we develop CREDAL, an automatic tool that employs the source code of a crashing program to enhance core dump analysis and turns a core dump to an informative aid in tracking down memory corruption vulnerabilities. Specifically, CREDAL systematically analyzes a core dump potentially corrupted and identifies the crash point and stack frames. For a core dump carrying corrupted data, it goes beyond the crash point and stack trace. In particular, CREDAL further pinpoints the variables holding corrupted data using the source code of the crashing program along with the stack frames. To assist software developers (or security analysts) in tracking down a memory corruption vulnerability, CREDAL also performs analysis and highlights the code fragments corresponding to data corruption. To demonstrate the utility of CREDAL, we use it to analyze 80 crashes corresponding to 73 memory corruption vulnerabilities archived in Offensive Security Exploit Database. We show that, CREDAL can accurately pinpoint the crash point and (fully or partially) restore a stack trace even though a crashing program stack carries corrupted data. In addition, we demonstrate CREDAL can potentially reduce the manual effort of finding the code fragment that is likely to contain memory corruption vulnerabilities.