Bibliography
Web browsers are among the most important, but also most complex, software systems for accessing the web. It is therefore not surprising that they are an attractive target for attackers. Especially in the last decade, security researchers and browser vendors have developed security mechanisms such as security-relevant HTTP headers to make browsers more secure. Although the security community is aware of the importance of these headers, legacy applications and competing requirements from different parties have led to potentially insecure header configurations. Even when individual security headers are configured correctly, conflicts between their functionalities can lead to unforeseen browser behaviors and vulnerabilities. Recently, the first work analyzing duplicated and conflicting headers was published by Calzavara et al. at USENIX Security [1]. The authors focused on inconsistent protections arising from using both the X-Frame-Options header and the framing protection of the Content-Security-Policy. We extend their work by analyzing browser behaviors when parsing duplicated headers, conflicting directives, and values that do not conform to the headers' ABNF grammar. We created an open-source testbed of more than 19,800 test cases, of which nearly 300 are executed in a set of 66 different browsers. Our work shows that browsers largely conform to the specification and behave securely. However, all tested browsers behave differently when, for example, parsing the Strict-Transport-Security header. Moreover, Chrome, Safari, and Firefox behave differently if the header contains a character that is not allowed by the defined ABNF. Depending on the browser, the protection mechanism is then fully enforced, partially enforced, or not enforced at all and thus completely bypassable.
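To make the kind of test case such a testbed contains concrete, here is a hypothetical Python sketch (not the paper's actual harness) of a server that sends a response with duplicated, conflicting Strict-Transport-Security headers; different browsers may then enforce, partially enforce, or ignore the policy:

    # Hypothetical sketch of one test case: a page whose
    # Strict-Transport-Security header is duplicated with conflicting
    # max-age values. The paper's testbed also covers values containing
    # characters outside the header's ABNF.
    from http.server import BaseHTTPRequestHandler, HTTPServer

    class StsTestCase(BaseHTTPRequestHandler):
        def do_GET(self):
            self.send_response(200)
            # Duplicated header: two STS policies with conflicting max-age.
            self.send_header("Strict-Transport-Security", "max-age=31536000")
            self.send_header("Strict-Transport-Security", "max-age=0")
            self.send_header("Content-Type", "text/html")
            self.end_headers()
            self.wfile.write(b"<html><body>STS duplication test</body></html>")

    if __name__ == "__main__":
        # Note: browsers only honor STS over HTTPS; a real testbed would
        # serve this via TLS. Plain HTTP keeps the sketch self-contained.
        HTTPServer(("localhost", 8443), StsTestCase).serve_forever()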
Current algorithms for context-free parsing inflict a trade-off between ease of understanding, ease of implementation, theoretical complexity, and practical performance. No algorithm achieves all of these properties simultaneously. Might et al. introduced parsing with derivatives, which handles arbitrary context-free grammars while being both easy to understand and simple to implement. Despite much initial enthusiasm and a multitude of independent implementations, its worst-case complexity has never been proven to be better than exponential. In fact, high-level arguments claiming it is fundamentally exponential have been advanced and even accepted as part of the folklore. Performance ended up being sluggish in practice, and this sluggishness was taken as informal evidence of exponentiality. In this paper, we reexamine the performance of parsing with derivatives. We have discovered that it is not exponential but, in fact, cubic. Moreover, simple (though perhaps not obvious) modifications to the implementation by Might et al. lead to an implementation that is not only easy to understand but also highly performant in practice.
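To illustrate the underlying idea, the following is a minimal Python sketch of Brzozowski derivatives restricted to regular languages; the paper's subject is the extension to arbitrary context-free grammars, which additionally requires laziness, memoization, and fixed points, and whose complexity the cubic bound concerns. All names here are illustrative.

    # Minimal sketch of Brzozowski derivatives over regular languages only;
    # parsing with derivatives extends this to full CFGs.
    from dataclasses import dataclass

    class Lang: pass

    @dataclass(frozen=True)
    class Empty(Lang): pass            # no strings at all

    @dataclass(frozen=True)
    class Eps(Lang): pass              # exactly the empty string

    @dataclass(frozen=True)
    class Char(Lang):
        c: str                         # a single character

    @dataclass(frozen=True)
    class Alt(Lang):
        left: Lang
        right: Lang                    # union of two languages

    @dataclass(frozen=True)
    class Cat(Lang):
        first: Lang
        second: Lang                   # concatenation

    def nullable(l: Lang) -> bool:
        """Does the language accept the empty string?"""
        if isinstance(l, (Empty, Char)):
            return False
        if isinstance(l, Eps):
            return True
        if isinstance(l, Alt):
            return nullable(l.left) or nullable(l.right)
        if isinstance(l, Cat):
            return nullable(l.first) and nullable(l.second)
        raise TypeError(l)

    def derive(l: Lang, c: str) -> Lang:
        """The language of suffixes that may follow the character c."""
        if isinstance(l, (Empty, Eps)):
            return Empty()
        if isinstance(l, Char):
            return Eps() if l.c == c else Empty()
        if isinstance(l, Alt):
            return Alt(derive(l.left, c), derive(l.right, c))
        if isinstance(l, Cat):
            rest = Cat(derive(l.first, c), l.second)
            if nullable(l.first):
                return Alt(rest, derive(l.second, c))
            return rest
        raise TypeError(l)

    def matches(l: Lang, s: str) -> bool:
        for ch in s:
            l = derive(l, ch)
        return nullable(l)

    # Example: (ab)|(ac) matches "ab" and "ac" but not "a".
    g = Alt(Cat(Char("a"), Char("b")), Cat(Char("a"), Char("c")))
    assert matches(g, "ab") and matches(g, "ac") and not matches(g, "a")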
Modern static bug finding tools are complex. They typically consist of hundreds of thousands of lines of code, and most of them are wedded to one language (or even one compiler). This complexity makes the systems hard to understand, hard to debug, and hard to retarget to new languages, thereby dramatically limiting their scope. This paper reduces checking system complexity by questioning a fundamental assumption: that checkers must depend on a full-blown language specification and compiler front end. Instead, our program checkers are based on drastically incomplete language grammars ("micro-grammars") that describe only the portions of a language relevant to a checker. As a result, our implementation is tiny: roughly 2,500 lines of code, about two orders of magnitude smaller than a typical system. We hope that this dramatic increase in simplicity will allow people to use more checkers on more systems in more languages. We implement our approach in μchex, a language-agnostic framework for writing static bug checkers. We use it to build micro-grammar based checkers for six languages (C, the C preprocessor, C++, Java, JavaScript, and Dart) and find over 700 errors in real-world projects.
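As a toy illustration of the micro-grammar idea (not the actual μchex implementation), the following Python checker recognizes only the one fragment of C-like syntax relevant to a single bug class and skips everything it cannot parse; the regular expression stands in for a micro-grammar:

    # Illustrative micro-grammar-style checker: flag an if-condition
    # followed immediately by a semicolon, i.e. an accidentally empty
    # body such as `if (x > 0);`. Everything outside this fragment is
    # ignored, so no full language front end is needed.
    import re
    import sys

    PATTERN = re.compile(r'\bif\s*\([^)]*\)\s*;')

    def check(path: str) -> None:
        with open(path, errors="replace") as f:
            for lineno, line in enumerate(f, 1):
                if PATTERN.search(line):
                    print(f"{path}:{lineno}: suspicious empty if-body")

    if __name__ == "__main__":
        for p in sys.argv[1:]:
            check(p)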
Programming languages often include specialized syntax for common datatypes (e.g., lists), and some also build in support for specific specialized datatypes (e.g., regular expressions), but user-defined types must use general-purpose syntax. Frustration with this causes developers to use strings, rather than structured data, with alarming frequency, leading to correctness, performance, security, and usability issues. Allowing library providers to modularly extend a language with new syntax could help address these issues. Unfortunately, prior mechanisms either limit expressiveness or are not safely composable: individually unambiguous extensions can still cause ambiguities when used together. We introduce type-specific languages (TSLs): logic associated with a type that determines how the bodies of generic literals, able to contain arbitrary syntax, are parsed and elaborated, hygienically. The TSL for a type is invoked only when a literal appears where a term of that type is expected, guaranteeing non-interference. We give evidence supporting the applicability of this approach and formally specify it with a bidirectionally typed elaboration semantics for the Wyvern programming language.
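In Wyvern, TSLs are a compile-time, type-directed elaboration mechanism; as a loose runtime analogy only (all names hypothetical), the following Python sketch shows the core dispatch: the body of a generic literal has no meaning on its own and is handed to a parser associated with the type expected at that position.

    # Loose runtime analogy of the TSL idea: the expected type chooses
    # the parser for a literal's body (illustrative names throughout).
    import re
    from dataclasses import dataclass

    @dataclass
    class Url:
        scheme: str
        host: str
        path: str

        @classmethod
        def parse_literal(cls, text: str) -> "Url":
            scheme, rest = text.split("://", 1)
            host, _, path = rest.partition("/")
            return cls(scheme, host, "/" + path)

    @dataclass
    class Regex:
        pattern: str

        @classmethod
        def parse_literal(cls, text: str) -> "Regex":
            re.compile(text)   # validate the body eagerly
            return cls(text)

    def literal(expected_type, body: str):
        # The literal body is parsed by the TSL of the expected type,
        # so individually unambiguous syntaxes cannot interfere.
        return expected_type.parse_literal(body)

    home = literal(Url, "https://example.com/index.html")
    digits = literal(Regex, r"\d+")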