CMU SoS Lablet Quarterly Executive Summary - October 2021
A. Fundamental Research
High-level report of a result or partial result that helped move security science forward. In most cases it should point to a "hard problem". These are the most important research accomplishments of the Lablet in the previous quarter.
Jonathan Aldrich
Obsidian: A Language for Secure-by-Construction Blockchain Programs
- We continue to work on a new implementation of Obsidian that targets the Ethereum virtual machine, in cooperation with the Ethereum Foundation. We are currently focusing on allocating objects, which is a key step toward evaluating the gas costs of using Obsidian for smart contracts on Ethereum. Once we are able to allocate objects in memory, we will arrange for them to be archived in storage so that the ones that need to persist can do so.
- As you recall, Obsidian uses a technique called typestate to capture rules for protocols that users of objects must follow. For example, a File has to be opened before it can be read, and it can only be closed after it is first opened. We were interested in whether this kind of technique could detect or prevent bugs in real-world smart contracts. To that end, we used the Slither static analysis framework to build a detector for protocols like the ones Obsidian's type system can capture. On a sample of 100 Solidity contracts from the SmartBugs Wild dataset, the detector identified 71 protocols in 37 contracts. A manual review found that two of these protocols were false positives, leaving 69 protocols among 37 contracts. We think it is promising that over a third of the contracts we analyzed include protocols that Obsidian's type system could capture.
- We built a detector for one known-vulnerable protocol usage pattern. In that pattern, a lockable contract can have its unlock function called when the contract is already unlocked, resulting in a state where ownership of the contract could be improperly reclaimed. Of the 4,500 contracts on which we ran the detector, it flagged 55 as potentially buggy. Manual review showed that 36 of them were vulnerable. So, although we did not identify a general-purpose mechanism for finding bugs, we showed that one can develop checkers for specific protocols to find real vulnerabilities, and that some known-vulnerable protocols are used repeatedly on Ethereum.
- We compared the gas costs of Obsidian and Solidity. Because the Obsidian compiler is still limited, the tests focus on low-level operations, such as arithmetic, so the results should be considered exploratory. We found that the Obsidian versions of contracts cost about 10k gas less to deploy (~$2 US) and the Obsidian functions cost about 1k gas less to execute (~$0.20 US). We find this a little surprising, and the cause merits further investigation, but it is promising so far. It could be that the Yul optimizer, which we use, does a better job than the optimizer that solc uses when emitting EVM bytecode.
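The lockable-contract pattern described above can be illustrated with a small model. The following is a minimal sketch in Python, not Obsidian or Solidity, and not taken from any of the analyzed contracts: the class names, the `transfer` method, and the `replay_unlock` scenario are hypothetical. A runtime state check stands in for the protocol rule that a typestate system such as Obsidian's would enforce at compile time.

```python
from enum import Enum, auto


class State(Enum):
    UNLOCKED = auto()
    LOCKED = auto()


class ProtocolError(Exception):
    """Raised when a method is called in a state the protocol forbids."""


def _only(caller, allowed):
    # Hypothetical stand-in for an "only owner" access check.
    if caller != allowed:
        raise PermissionError(f"{caller!r} is not permitted")


class NaiveLockable:
    """Models the vulnerable pattern: unlock() never checks the current state."""

    def __init__(self, owner):
        self.owner = owner
        self.previous_owner = None

    def lock(self, caller):
        _only(caller, self.owner)
        self.previous_owner, self.owner = self.owner, None

    def unlock(self, caller):
        _only(caller, self.previous_owner)
        self.owner = self.previous_owner  # restores ownership unconditionally

    def transfer(self, caller, new_owner):
        _only(caller, self.owner)
        self.owner = new_owner


class CheckedLockable(NaiveLockable):
    """Same contract, but lock/unlock are only legal in the matching state."""

    def __init__(self, owner):
        super().__init__(owner)
        self.state = State.UNLOCKED

    def _require(self, expected):
        if self.state is not expected:
            raise ProtocolError(f"expected {expected.name}, was {self.state.name}")

    def lock(self, caller):
        self._require(State.UNLOCKED)
        super().lock(caller)
        self.state = State.LOCKED

    def unlock(self, caller):
        self._require(State.LOCKED)
        super().unlock(caller)
        self.state = State.UNLOCKED


def replay_unlock(contract):
    """Lock, unlock, hand the contract to bob, then call unlock a second time."""
    contract.lock("alice")
    contract.unlock("alice")
    contract.transfer("alice", "bob")
    contract.unlock("alice")  # contract is already unlocked at this point
    return contract.owner
```

Under these assumptions, `replay_unlock(NaiveLockable("alice"))` ends with the original owner back in control after the transfer, while `replay_unlock(CheckedLockable("alice"))` raises `ProtocolError` at the second `unlock`.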
Lujo Bauer
Securing Safety-Critical Machine Learning Algorithms
We have set up experimental infrastructure and have been experimenting with using FGSM for adversarial training for malware detectors. Although using FGSM to create malware variants wouldn't produce functional malware (because the attack is not aware of the semantics of a binary), we are exploring whether it could nevertheless lead classifiers to become more robust. We have also been developing additional experimental infrastructure to use our previous, semantics-aware attacks for adversarial training of malware classifiers.
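As a rough illustration of FGSM-based adversarial training, the sketch below applies the fast gradient sign method to a toy logistic-regression classifier over numeric feature vectors. It is not the project's infrastructure: the model, the function names, and the example data are assumptions chosen to keep the example self-contained, and the perturbation acts on features rather than on a binary, so it says nothing about whether the result is functional malware.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(w, b, x):
    """Probability that feature vector x belongs to class 1."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def loss(w, b, x, y):
    """Binary cross-entropy for a single example."""
    p = min(max(predict(w, b, x), 1e-12), 1 - 1e-12)
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))


def sign(v):
    return (v > 0) - (v < 0)


def fgsm(w, b, x, y, epsilon):
    """x_adv = x + epsilon * sign(dLoss/dx); for this model dLoss/dx = (p - y) * w."""
    err = predict(w, b, x) - y
    return [xi + epsilon * sign(err * wi) for xi, wi in zip(x, w)]


def adversarial_training_step(w, b, data, epsilon, lr):
    """One gradient-descent step on FGSM-perturbed copies of the training data."""
    grad_w = [0.0] * len(w)
    grad_b = 0.0
    for x, y in data:
        x_adv = fgsm(w, b, x, y, epsilon)
        err = predict(w, b, x_adv) - y
        for i, xi in enumerate(x_adv):
            grad_w[i] += err * xi / len(data)
        grad_b += err / len(data)
    new_w = [wi - lr * gi for wi, gi in zip(w, grad_w)]
    return new_w, b - lr * grad_b


# Toy weights and data (assumed values, for illustration only).
w, b = [1.0, -2.0, 0.5], 0.1
data = [([0.2, 0.1, -0.3], 1), ([-0.5, 0.4, 0.9], 0)]
new_w, new_b = adversarial_training_step(w, b, data, epsilon=0.1, lr=0.5)
```

For this linear model, each FGSM step moves the input in the direction that strictly increases the loss, which is what makes the perturbed examples useful as harder training inputs.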
Lorrie Cranor
Characterizing user behavior and anticipating its effects on computer security with a Security Behavior Observatory
Accepted paper: What breach? Measuring online awareness of security incidents by studying real-world browsing behavior. Sruti Bhagavatula, Lujo Bauer, Apu Kapadia. To appear at EuroUSEC 2021. https://www.cs.cmu.edu/~sbhagava/papers/breach-engagement-eurousec21.pdf
- This paper uses SBO data to examine (1) how often people read about security incidents online, (2) whether and to what extent they then follow up and take action, and (3) what influences the likelihood that they will read about an incident and take some action.
Accepted paper: How Do Home Computer Users Browse the Web? Kyle Crichton, Nicolas Christin, and Lorrie Cranor. To appear in an upcoming issue of the ACM Transactions on the Web journal.
- Using data collected through the SBO, we provide new insights into how users browse the internet.
- First, we compare our data to previous studies conducted over the past two decades and identify changes in user browsing and navigation. Most notably, we observe a substantial increase in the use of multiple browser tabs to switch between pages.
- Using the more detailed information provided by the SBO, we identify and quantify a critical measurement error inherent in previous server-side measurements that do not capture when users switch between browser tabs. This issue leads to an incomplete picture of user browsing behavior and an inaccurate measurement of user navigation and dwell time.
- In addition, we observe that users exhibit a wide range of browsing habits that do not easily cluster into different categories, a common assumption made in research study design and software development.
- We find that browsing the web consumes the majority of users' time spent on their computer, eclipsing the use of all other software on their machine.
- While browsing, we show that users spend the majority of their time browsing a few popular websites, but also spend a disproportionate amount of time on low-visited websites on the edges of the internet.
- We find that users navigating to these low-visited sites are much more likely to interact with riskier content like adware, alternative health and science information, and potentially illegal streaming and gambling sites.
- Finally, we identify the primary gateways that are used to navigate to these low-visited sites and discuss the implications for future research.
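The tab-switching measurement error mentioned above can be made concrete with a small example. This is a sketch under stated assumptions: the event format (timestamp in seconds, event kind, page) is hypothetical and is not the SBO's schema, and a page load is assumed to give that page focus. A server-side view sees only page loads, while a client-side view also sees focus changes between tabs.

```python
def server_side_dwell(events):
    """Dwell time per page from page loads only: each page is assumed
    to be viewed until the next page load."""
    loads = [(t, page) for t, kind, page in events if kind == "load"]
    dwell = {}
    for (t, page), (t_next, _) in zip(loads, loads[1:]):
        dwell[page] = dwell.get(page, 0) + (t_next - t)
    return dwell


def client_side_dwell(events):
    """Dwell time per page from loads and tab-focus events: time is
    credited to whichever page currently has focus."""
    dwell = {}
    for (t, _, page), (t_next, _, _) in zip(events, events[1:]):
        dwell[page] = dwell.get(page, 0) + (t_next - t)
    return dwell


# Hypothetical session: the user opens "mail" in a second tab,
# switches back to "news" for 40 seconds, then returns to "mail".
events = [
    (0, "load", "news"),
    (10, "load", "mail"),
    (15, "focus", "news"),
    (55, "focus", "mail"),
    (60, "load", "shop"),
]

server = server_side_dwell(events)
client = client_side_dwell(events)
```

In this made-up session the load-only view credits 50 seconds to "mail" and 10 to "news", while the focus-aware view reverses those figures, which is the kind of dwell-time distortion the bullet on server-side measurement describes.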
David Garlan
Model-Based Explanation For Human-in-the-Loop Security
We have made progress on the following:
In addition to the above-mentioned public accomplishments, we worked on releasing a public version of the explainability framework primarily developed by Roykrong Sukkerd under this grant. The framework consists of APIs for constructing an MDP for a particular domain, together with vocabulary information for providing contrastive explanations in that domain. The release is in progress at https://github.com/cmu-able/explainable-planning.
Joshua Sunshine
Security Science Research Experience for Undergraduates
Six REU students spoke to NSA researchers and close associates on August 4 about their summer work. The following talks were given:
- picoCTF cybersecurity & education research through online gaming - Jenna Bustami, Rachel Nguyen, and Xinyue Lai
- Enhancing flexible science, technology, engineering, and mathematics thinking by generating interactive diagrams at scale - Hwei-Shin Harriman
- Data Structures for Distributed, Federated Machine Learning - James Flemmin
- Privacy-Preserving Deep Learning - Allen Marquez, California State University, Los Angeles, Steven Wu
- Polymorphic Memory Hierarchy - Jennifer Seibert, SUNY Binghamton, Nathan Beckmann
- User Awareness of Social Media Algorithms - Maya De Los Santos, Northeastern University, Daniel Klug
- All REU students presented their work at a culminating poster session on August 5 to the entire Carnegie Mellon community. More than two hundred faculty, staff, and students attended the session.
- Adam Tagert spoke in mid-July to 20 REU students about research careers at NSA.