Summary: Spring 2023 SoS Quarterly Lablet Meeting
The Spring 2023 Science of Security and Privacy (SoS) Quarterly Lablet meeting was hosted virtually and in person by North Carolina State University (NCSU) on March 8-9, 2023. The attendees from the government and six SoS Lablets were welcomed by Laurie Williams (NCSU), Munindar Singh (NCSU), and Adam Tagert, the NSA SoS Technical Director. The purpose of this final Quarterly meeting was to enable the Lablets to present the results of the projects they had been working on for the past five years. During the overview, Adam said he was excited to have one last Quarterly Lablet meeting and noted how great it has been to see the results of the projects over the years.
Invited Talks
Enhancing the Security Posture of IoT: Study of Remote Attestation at the "Deep Edge"
Chris Meier, NSA
The research problem was to identify potentially malicious behavior in embedded IoT devices, and the researchers were looking to improve network slice security and the massive IoT slice for 5G. The researchers wanted to understand how to use new trust mechanisms in low-power edge devices to enhance network security. The researchers also wanted to employ, and add to, the ARM ecosystem for remote attestation in IoT edge devices using ARM TrustZone. The researchers looked at the attester-relying party interaction model. For edge device attestation, the procedure starts when PSA-standard claims are generated by the bootloader at boot and stored in secure memory; the attestation client then requests a PSA token from the attestation TA via an API call, and the attestation TA returns the PSA token to the attestation client. For verification, the researchers used Veraison. The verification procedure starts when endorsements and trust anchors are loaded via the provisioning service at creation time, and then the attester submits the PSA token to the verification service. In the future, the researchers plan to look at anomaly detection at the edge with tinyML.
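The attestation and verification steps described above can be sketched in a few lines of Python. This is a minimal illustration only, not the PSA token format or the Veraison API: real PSA tokens are signed with an attestation key, whereas this sketch uses an HMAC with a shared demo key as a stand-in. All names (`DEVICE_KEY`, `generate_claims`, `issue_token`, `verify_token`) and the challenge nonce are assumptions introduced for the example.

```python
import hashlib
import hmac
import json

# Hypothetical shared key standing in for the device's attestation key.
DEVICE_KEY = b"demo-attestation-key"


def generate_claims(firmware: bytes) -> dict:
    """Bootloader step: measure the firmware and record the claim at boot."""
    return {"measurement": hashlib.sha256(firmware).hexdigest()}


def issue_token(claims: dict, nonce: str, key: bytes = DEVICE_KEY) -> dict:
    """Attestation TA step: bind the claims to a challenge nonce and sign them."""
    payload = json.dumps({"claims": claims, "nonce": nonce}, sort_keys=True)
    tag = hmac.new(key, payload.encode(), hashlib.sha256).hexdigest()
    return {"payload": payload, "tag": tag}


def verify_token(token: dict, nonce: str, endorsed: set,
                 key: bytes = DEVICE_KEY) -> bool:
    """Verifier step: check the signature, the nonce, and the endorsed values."""
    expected = hmac.new(key, token["payload"].encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, token["tag"]):
        return False
    body = json.loads(token["payload"])
    return body["nonce"] == nonce and body["claims"]["measurement"] in endorsed


# Endorsements (known-good measurements) are provisioned ahead of time.
good_firmware = b"firmware v1.0"
endorsed_values = {hashlib.sha256(good_firmware).hexdigest()}

good_token = issue_token(generate_claims(good_firmware), nonce="abc123")
bad_token = issue_token(generate_claims(b"firmware v1.0 + implant"), nonce="abc123")

accepted = verify_token(good_token, "abc123", endorsed_values)
rejected = verify_token(bad_token, "abc123", endorsed_values)
```

In this sketch the verifier accepts a token only if the signature checks out, the nonce matches the challenge, and the reported measurement is among the provisioned endorsements, so modified firmware or a replayed token is refused.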
Trust and the Use of AI to Develop Software
Matt Fussa, Cisco
One important question regarding AI, especially as it is used more, is who owns and controls the output of an AI generator. For example, if you use an AI generator to create code for software, who really owns that software? The speaker noted that one thing that AI does not do well is identify its sources so that a person can understand where the output came from. Because open AI is not perfect, people should not put their complete trust in what it produces. For example, suppose AI generates vulnerable code, and a person doesn't review it or make corrections before it moves up the development ladder. In that case, the software can later be vulnerable to cyberattacks. Using open AI also creates licensing risks. He noted that we are just scratching the surface of what trustworthy AI is. Some principles of trust include: reliable functionality, human oversight of AI outputs, foundational security in AI infrastructure, high levels of transparency, and clear ownership and IP rights. He concluded by noting that many discussions about AI revolve around bias, but researchers will have to look at much more than that.
Project Presentations
Carnegie Mellon University (CMU)
1. Characterizing User Behavior and Anticipating its Effects on Computer Security with a Security Behavior Observatory
Kyle Crichton, CMU
The goal of this research was to characterize home computer users' computer use and online behavior choices that impact security and privacy. This work can be used to develop models and technologies to be targeted to realistic situations. The project's research areas included: user engagement and security; password creation and use; private browsing usage; web browsing behavior; and exposure to malicious content. This research addresses the hard problem of "Understanding and Accounting for Human Behavior" by collecting data directly from people's own home computers, thereby capturing people's computing behavior "in the wild." The researchers noted that this data is the closest to the ground truth of the users' everyday security and privacy challenges. The researchers found that low-visited sites have a higher likelihood of containing risky or malicious content and that 89% of malware and phishing websites are located in the periphery of the internet. The three main gateways to the periphery included search engines (26.5%), other periphery sites (24.5%), and mid-tier websites (49%).
2. Model-Based Explanation for Human-in-the-Loop Security
David Garlan, CMU
An effective response to cyberattacks often requires a combination of both automated and human-mediated actions. The researchers noted that we currently lack adequate methods to reason about such human-system coordination, including ways to determine when to allocate tasks to each party and how to gain assurance that automated mechanisms are appropriately aligned with organizational needs and policies. This project focuses on combining human and automated actions in response to cyberattacks. The researchers noted that they would like the system to understand more about humans and would like humans to understand more about systems. The researchers stated that it is important to improve confidence through explanations that happen at a level where humans can detect problems and fix the system. The researchers said that formal models for planning can be used as the basis of human-understandable explanations. In one study, the control group got a scenario, the robot's plan, and a cost profile. They were told what the robot was planning to do. They were also told about the consequences of certain paths, and then the humans had to decide whether this was the best option or not and how confident they were in their decision. The researchers hypothesized that participants who receive the explanations are more likely to correctly determine whether the robot's plans align with their preferences and that they would be more confident in their judgment. The researchers found that the participants who received the explanations were 3.8 times more likely to be correct and confident in their decision than those who did not.
3. Obsidian: A Language for Secure-By-Construction Blockchain Programs
Michael Coblenz, CMU
Blockchains have been proposed to support transactions on distributed, shared state, but hackers have exploited security vulnerabilities in existing programs. The researchers applied user-centered design in the creation of Obsidian, a new language that uses typestate and linearity to support stronger safety guarantees than current approaches for programming blockchain systems. The researchers compared Obsidian to Solidity. The researchers noted that Obsidian code is a little shorter and that Solidity requires explicit, verbose state checks. In a study, the researchers gave JavaScript developers a 90-minute tutorial in both languages and then had them create an auction website using Obsidian and Solidity. Many of the participants were done in 30 minutes. Many more of the participants were able to submit correct code to create the auction site using Obsidian (7 of 8 correct) than Solidity (2 of 8 correct). The researchers concluded that Obsidian is easier to use than other languages and is also safer.
4. Securing Safety-Critical Machine Learning Algorithms
Lujo Bauer, CMU
The goal of this project was to understand the threats against practical uses of ML and how to defend against them. One question the researchers wanted to answer was whether an attacker can fool ML classifiers. The researchers created adversarial eyeglasses with two goals: that the attack be physically realizable and that it be inconspicuous. In step 1, they generated realistic eyeglasses. In step 2, the researchers generated realistic adversarial eyeglasses. The researchers also looked at hypothetical attacks on malware detection. The researchers stated that malware detectors in the future should be trained on both adversarial and normal malware. The researchers concluded that real-world applications of ML are vulnerable to evasion attacks, and some defenses can be adapted to mitigate attacks. The researchers noted that many open questions remain: Do attacks/defenses generalize? What defenses are good enough? How can one make ML models more transparent?
International Computer Science Institute (ICSI)
1. Contextual Integrity for Computer Systems
Michael Tschantz, ICSI
Contextual Integrity (CI) is a philosophical theory of privacy: privacy is the flow of information in accordance with the legitimate norms of the context. The researchers compared privacy risk assessments with CI. The researchers found that they both operate at multiple levels. They arrive at different questions, which suggests that one could try to hybridize the two. They both require subjective judgments, and utilitarian-like thinking figures into both. The researchers concluded that privacy risk assessments are familiar but may fall into corporate logic, and differential privacy is a great tool but not normative itself.
2. Designing for Privacy
Julia Bernd, ICSI
Design interventions for privacy can occur at a lot of stages and levels, and the goal of this project was to develop a new toolbox of techniques and help designers understand when best to apply tools. The researchers conducted a literature review of work on privacy and design at HCI conferences. They developed a model that categorizes the factors that affect developers' decision-making: from the environmental level (context outside of organization or team) to organizational, product, development process-related levels, and finally, personal levels. The researchers in future work want to empirically validate the model and quantify the relative impacts. With students at UC Berkeley, the researchers also evaluated the familiarity of smartphone users with privacy and security settings, their expectations about their ability to configure those settings, and their understanding of the privacy and security threats against which the settings are supposed to protect them. The researchers found that many people were unaware of smartphone privacy/security settings and their defaults, and had not configured them in the past, though they expressed willingness to do it in the future. The researchers also conducted cognitive walkthrough interviews with a demographically diverse sample of iOS and Android users. They asked the participants to talk through configuring three smartphone privacy settings, then asked follow-up questions about difficulty and clarity.
3. Governance for Big Data
Serge Egelman, ICSI
The researchers noted that the project started by looking at how organizations are adopting frameworks for big data governance. The researchers then pivoted to examining how organizations are dealing with privacy and security issues in the data collection they do. They surveyed 120 developers of child-directed apps. Research questions the researchers wanted to answer include: Are the developers aware of relevant privacy laws? Are they aware of their own app's privacy behaviors? How do the developers check their apps for compliance with COPPA (Children's Online Privacy Protection Act)? The researchers found that around 80% of the developers knew about COPPA/GDPR (General Data Protection Regulation). However, only 50% said they have processes for addressing the laws. A quarter of the participants stated that they were aware of a law called FLIRPPA, which does not actually exist. The researchers also asked why developers don't get consent for data collection: some said they put it in their policy, but many thought Google was checking their compliance before the app was published. Regarding app behaviors, half of the survey participants did not know if the data sent was encrypted. Almost three-quarters (72%) claimed they were confident that their apps did not send data to third parties, but more than a quarter of them were wrong. The researchers noted that SDKs (Software Development Kits) are not looked at by developers very closely. The researchers concluded that developers need to vet the SDKs they use and that platforms need to provide better guidance to developers.
4. Operationalizing Contextual Integrity
Serge Egelman, ICSI
The ultimate goal is to design new privacy controls that are grounded in the theory of contextual integrity so that they can automatically infer contextual norms and handle data-sharing and disclosure on a per-use basis. During the study, the researchers focused on in-home listening devices. The researchers found the topics that devices should have discretion about collecting include personal/sensitive queries, information about children, financial topics, background noise, personal identifiers, and more. The researchers surveyed 200 online participants in order to discover if privacy controls impact adoption. The researchers found that people don't care whether audio or transcripts are captured by in-home listening devices. They care about how data is used, where data is processed, and whether a human hears the audio or reads the transcript. People do not want others to review the recordings or transcripts, but are more agreeable if a computer does. In-home listening device users' expectations do not match reality.
5. Scalable Privacy Analysis
Primal Wijesekera, ICSI
The researchers constructed a toolchain that allows them to automatically perform dynamic analysis on mobile apps to monitor what sensitive personal information they attempt to access, and then to whom they transmit it. This allows the researchers to perform large-scale studies of the privacy behaviors of the mobile app ecosystem, as well as devise new methods of protecting user privacy. During one project, the researchers found that consumers expect paid apps to have better security and privacy behaviors. However, that isn't usually the case. In an ongoing project, the researchers are looking at health app compliance with HIPAA guidelines. In another project, the researchers are also looking at anti-analysis techniques that app developers are deploying. One question they want to answer is whether app developers are deploying anti-analysis techniques because they have legitimate concerns or because of malicious intent. The researchers also want to determine how anti-analysis techniques affect the behavior of an app.
North Carolina State University (NCSU)
1. Coordinated Machine Learning-Based Vulnerability and Security Patching for Resilient Virtual Computing Infrastructure
Helen Gu, NCSU
This research aimed to assist administrators of virtualized computing infrastructures in making services more resilient to security attacks. This is done by applying machine learning to reduce both security and functionality risks in software patching, continually monitoring patched and unpatched software to discover vulnerabilities and triggering proper security updates. The first project looked at Self-patch (self-triggering patching framework), which performs on-demand attack detection and patching of containerized apps. Self-patch had an 81% detection rate with a 0.72% false positive rate. Self-patch also reduces memory and disk overhead. The second project looked at Classified Distributed Learning (CDL), which improves attack detection accuracy with application-aware modeling and attack detection. CDL reduces the false positive rate by 98% and increases the true positive rate by 66%. The third project looked at Self-supervised Hybrid Learning (SHIL), which combines unsupervised and supervised learning methods to improve detection accuracy. SHIL, compared with pure unsupervised models, reduces the false positive rate by 39-40% and increases the detection rate by 4-12%. SHIL, compared with pure supervised models, reduces the false positive rate by 91-92% and increases the detection rate by 9-19%.
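For readers less familiar with the metrics quoted above, the following sketch shows how a detection rate (true positive rate), a false positive rate, and a relative reduction are conventionally computed. The counts and rates in the example are invented purely for illustration and are not the study's data; the function names are likewise assumptions.

```python
def detection_rate(true_pos: int, false_neg: int) -> float:
    """Share of real attacks that the detector flags (true positive rate)."""
    return true_pos / (true_pos + false_neg)


def false_positive_rate(false_pos: int, true_neg: int) -> float:
    """Share of benign events that the detector wrongly flags."""
    return false_pos / (false_pos + true_neg)


def relative_change(before: float, after: float) -> float:
    """Relative change from `before` to `after`; negative means a reduction."""
    return (after - before) / before


# Invented counts: 100 attacks (81 caught) and 10,000 benign events (72 flagged).
dr = detection_rate(true_pos=81, false_neg=19)
fpr = false_positive_rate(false_pos=72, true_neg=9928)

# Invented rates: a false positive rate falling from 5% to 0.1%.
fpr_change = relative_change(before=0.05, after=0.001)
```

Note that a statement such as "reduces the false positive rate by 98%" is normally a relative reduction of this kind, so the absolute rate after the change depends on the baseline it is measured against.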
2. Development of Methodology Guidelines for Security Research
Matthew Armstrong, University of Alabama (NCSU Sub-Lablet)
The goal of this project was to aid the security research community in conducting and reporting methodologically sound science through development, refinement, and use of community-based security research guidelines. The researchers proposed the characterization of the security literature based upon those guidelines. The researchers developed an initial set of guidelines to aid reporting security research and are currently developing a new iteration of guidelines with external input. For future work, the researchers want to establish a need for a reporting guideline framework that would organize and support reporting across sub-domains of security research.
3. Predicting the Difficulty of Compromise through How Attackers Discover Vulnerabilities
Andy Meneely, Rochester Institute of Technology (NCSU Sub-Lablet)
The project's goal is to provide actionable feedback on the discoverability of a vulnerability. This feedback is useful for in-process software risk assessment, incident response, and the vulnerabilities equities process. The researchers' approach is to combine the attack surface metaphor and attacker behavior to estimate how attackers will approach discovering a vulnerability. The researchers want to develop metrics that are useful and improve the metric formulation based on qualitative and quantitative feedback. The researchers stated that modern software relies on third-party open-source packages as dependencies. While using open-source may be free, users must ensure their dependencies are secure. The researchers conducted four studies: study 1 was a comparative study of vulnerability reporting by software composition analysis tools; study 2 investigated security releases of open-source packages; study 3 measured code review coverage in dependency updates; and study 4 created a social network-based ranking of developers in the Rust package ecosystem. The researchers noted that there is no formal process to assess human error in software engineering.
4. Reasoning about Accidental and Malicious Misuse via Formal Methods
Munindar Singh, NCSU
This project seeks to aid security analysts in identifying and protecting against accidental and malicious actions by users or software through automated reasoning on unified representations of user expectations and software implementations to identify misuses sensitive to usage and machine context. During the project, the researchers used iRogue to identify rogue apps through their reviews. iRogue found 239 rogue apps, with 77% recall at 76% F1 score. Corba and Ember were used to extract remedial actions from breach reports. Ember produces suggestions of remedial actions based on a breach report. Caspar and Scheture were used for extracting app problems and user actions. Cardpliance was used to look at PCI (Payment Card Industry) compliance of Android applications. The researchers ran 6 PCI checks on a dataset of 368 applications and found 15 PCI violations across 6 applications. AARDroid was used for the analysis of payment service provider SDKs in Android. The researchers studied 50 Android payment SDKs and found that 37 of the SDKs failed to implement basic security requirements, and 12 SDKs failed to implement one advanced security requirement. The researchers also analyzed adversarial techniques in APT attacks based on CTI reports. Regarding outreach, the project involved five Ph.D. students, two undergraduates, and one master's student.
University of Illinois at Urbana-Champaign (UIUC)
1. An Automated Synthesis Framework for Network Security and Resilience
Kevin Jin, University of Arkansas (UIUC Sub-Lablet)
The researchers aimed to develop the analysis methodology needed to support scientific reasoning about the resilience and security of networks, with a particular focus on network control and information/data flow. There were three research tasks: in task 1, the researchers looked at network control synthesis and developed algorithms/systems that perform automated synthesis to enhance network security and resilience; in task 2, they looked at network software analysis and modeling and developed frameworks for writing secure network control programs; in task 3, the researchers looked at resilient and self-healing network architecture and applications. The project produced more than 30 papers and helped create four new cybersecurity courses.
2. Resilient Control of CPSs with Distributed Learning
Negin Musavi, UIUC
The researchers stated that typical formal verification/safety analysis approaches use strong assumptions like full knowledge of the model and perfect state observability. The researchers looked at hybrid hierarchical optimistic optimization, motivated by model checking for safe autonomy at a traffic roundabout. Their approach relies on multi-armed bandits from the machine learning literature. They solved the model checking problem using hybrid multi-armed bandits. The researchers constructed a super tree over the state space using the Hybrid Hierarchical Optimistic Optimization (HyHOO) algorithm and evaluated the efficiency of HyHOO. The researchers compared HyHOO with BoTorch and found that HyHOO achieves better performance than BoTorch. Regarding community outreach, the researchers created a summer camp in 2022 where 12 high schoolers learned the basics of signal processing, planning, vision, and control algorithms. The camp attendees also developed and employed code on an autonomous vehicle platform.
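To give a flavor of the multi-armed bandit idea that the approach builds on, the sketch below implements plain UCB1 over a handful of discrete arms. It is not HyHOO, which works hierarchically over a hybrid state space; it only illustrates the underlying "optimism in the face of uncertainty" principle. The arm probabilities, the Bernoulli reward model, and the function name `ucb1` are assumptions made for the example.

```python
import math
import random


def ucb1(arm_means, rounds, seed=0):
    """Run UCB1 on Bernoulli arms and return how often each arm was pulled."""
    rng = random.Random(seed)
    counts = [0] * len(arm_means)    # pulls per arm
    totals = [0.0] * len(arm_means)  # summed reward per arm

    for t in range(1, rounds + 1):
        if t <= len(arm_means):
            # Pull every arm once so each has an initial estimate.
            arm = t - 1
        else:
            # Pick the arm with the highest optimistic estimate:
            # empirical mean plus an exploration bonus that shrinks with pulls.
            arm = max(
                range(len(arm_means)),
                key=lambda a: totals[a] / counts[a]
                + math.sqrt(2 * math.log(t) / counts[a]),
            )
        reward = 1.0 if rng.random() < arm_means[arm] else 0.0
        counts[arm] += 1
        totals[arm] += reward
    return counts


# Three hypothetical options with unknown success probabilities.
pulls = ucb1([0.2, 0.5, 0.9], rounds=2000)
```

Over time the sampling budget concentrates on the most promising arm while the others are still tried occasionally, which is the behavior that makes bandit methods sample-efficient for search problems.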
3. Uncertainty in Security Analysis
David Nicol, University of Illinois at Urbana-Champaign
The goal of this project was to develop a mathematical basis for describing and analyzing the ability of an adversary to laterally traverse networks in the presence of uncertainty about connections and uncertainty about exploitable vulnerabilities. The researchers used this basis to develop algorithms for quantified risk analysis of cyber-physical systems. This research made contributions in modeling uncertainty inherent in the analysis of attack graphs and helped extend approaches in modeling the reliability of networks.
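As a rough illustration of what quantifying lateral movement under uncertainty can look like, the sketch below estimates by Monte Carlo sampling the probability that an attacker can reach a target host when each connection or exploit step exists only with some probability. This is a generic textbook technique, not the researchers' mathematical framework; the host names, the edge probabilities, and the assumption that edges are independent are all made up for the example.

```python
import random


def reach_probability(edges, source, target, trials=10000, seed=0):
    """Estimate P(target reachable from source) when each directed edge
    (u, v) is present independently with probability edges[(u, v)]."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # Sample one concrete attack graph from the uncertain one.
        adjacency = {}
        for (u, v), prob in edges.items():
            if rng.random() < prob:
                adjacency.setdefault(u, []).append(v)

        # Depth-first search from the attacker's foothold.
        stack, seen = [source], {source}
        while stack:
            node = stack.pop()
            if node == target:
                hits += 1
                break
            for nxt in adjacency.get(node, []):
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
    return hits / trials


# Hypothetical two-hop path: each hop is exploitable with probability 0.5,
# so the exact reachability probability is 0.5 * 0.5 = 0.25.
uncertain_edges = {("web", "app"): 0.5, ("app", "db"): 0.5}
estimate = reach_probability(uncertain_edges, "web", "db")
```

Sampling is only one option; for small graphs the same quantity can be computed exactly, which ties this kind of analysis to classical network reliability methods.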
University of Kansas (KU)
1. Flexible Mechanisms for Remote Attestation
Perry Alexander, KU
Remote attestation has enormous potential for establishing trust in highly distributed IoT and cyber-physical systems. However, significant work remains to define a unifying semantics of remote attestation that precisely defines guarantees and scales to large, heterogeneous systems. This research aims to create a science of trust and remote attestation, and prototype infrastructure for building remote attestation systems. A new approach is lifecycle attestation: manifests define systems in context, manifests can be related, manifests can be synthesized to implementations, and manifests are now formal structures. Some open hard questions that the researchers would like to continue to try to answer with their research include: What is good evidence? How do we gather evidence? How long does evidence have utility? How do we compose evidence? How does evidence relate to adversary behavior? The researchers also want to look at attestation over the system lifecycle.
2. Ontology and Epistemology of Resilience
John Symons, KU
The researchers noted that community resilience is loosely defined as the capacity to withstand or recover from adverse events and said that work on resilience typically focuses on engineered infrastructure but neglects cultural and normative factors. The researchers noted that intuitively it is well known that resilience matters, but currently, there is no good explanation of why and how some institutions are more or less resilient. The researchers focused on social norms in order to sketch strategic and practical capacities to understand and defend against social attacks. Instead of inferring social consequences from psychological operations at scale, the researchers analyzed efforts to undermine norms. For future work, the researchers are interested in understanding the social attack surface both strategically and tactically. The researchers noted that the social attack surface has been studied within the framework of cyber espionage, but less attention has been given to its role in cyber warfare. The researchers stated that a norm is resilient if a sufficient number of people believe that the empirical expectation holds and if they follow the normative expectation. The researchers noted that attacking the norms involves undermining empirical expectations or undermining normative expectations.
3. Secure Native Binary Executions
Prasad Kulkarni, KU
The researchers noted that software binaries are shipped without any security metrics and that no good, accessible tools exist to retroactively secure binary software. The project's goals were to develop technology to rate the security level of any arbitrary software binary and to develop tools to retroactively secure software binaries with indicators of effectiveness and overhead. In their first project, they worked on rating the security level of software binaries with the goals of detecting potential vulnerabilities and coding weaknesses/flaws in binaries, detecting run-time security checks in binaries, and creating an approach to compute the security level of binaries. In their second study, the researchers wanted to study challenges and develop techniques to secure program binaries. The researchers wanted to determine the source programming language from a binary and study the effectiveness and efficiency of binary-level analysis and security techniques. The researchers noted that much of the work is ongoing. The work led to many interesting discoveries, produced new algorithms and techniques, funded three current Ph.D. students, and enabled the development of a new class in software reverse engineering.
4. Side-Channel Attack Resistance
Heechul Yun, KU
The researchers noted that modern systems introduce new challenges in safety and security and stated that computing time is important for cyber-physical systems. The goal of the research was to build safe and secure computing infrastructure for the next generation of intelligent cyber-physical systems. The researchers wanted to make it so that a cyberattack cannot happen. The researchers used RT-Gang and found that, with it, a cyberattack cannot affect the execution time of critical tasks. The researchers found that deterministic memory is a data-centric cross-layer approach for real-time. The researchers used SpectreGuard, which is a data-centric cross-layer approach for security, as well as SpectreRewind, which had high performance and low noise. The researchers also looked at a RISC-V + NVDLA SoC platform and said that this open-source hardware is a big research opportunity. The impact of their research included many published papers, and the researchers created an open-source platform. Bosch is currently using the technology to protect its systems. The researchers developed an AI summer camp for rural high school students and educators in June 2022, and created classes based on their research.
Vanderbilt University
1. Foundations of CPS Resilience
Xenofon Koutsoukos, Vanderbilt
The goal of this project was to develop a systematic body of knowledge with strong theoretical and empirical underpinnings to inform the engineering of secure and resilient CPS that can resist not only known but also unanticipated attacks. The researchers looked at resilient distributed learning in multi-agent systems and resilient distributed multi-tasking learning. The researchers also looked at exploiting EM side channel information and found that one can start integrating cyber-physical models.
2. Mixed Initiative and Collaborative Learning in Adversarial Environments
Claire Tomlin, University of California, Berkeley (Vanderbilt Sub-Lablet)
One of the goals of the research was to characterize the limiting behavior of machine learning algorithms deployed in competitive settings. This research project focused on a game theoretic approach to learning dynamic behavior safely through reachable sets, probabilistically safe planning around people, and safe policy gradient reinforcement learning. An understanding of the behaviors (convergence, optimality, etc.) of these algorithms in such settings is sorely lacking. The research team looked at disturbance (attempts to force the system into an unsafe region) and control (attempts to stay safe), as well as fundamental issues with gradient play in games, since machine learning algorithms are increasingly being implemented in competitive settings. The researchers have been developing a methodology that takes feedback into account and looked at multi-hypothesis interactions.
3. Multi-Model Test Bed for the Simulation-Based Evaluation of Resilience
Peter Volgyesi, Vanderbilt
The goal of the Multi-model Testbed is to provide a collaborative design tool for evaluating various cyberattack/defense strategies and their effects on the physical infrastructure. The web-based, cloud-hosted environment integrates state-of-the-art simulation engines for the different CPS domains and presents interesting research challenges as ready-to-use scenarios. Input data, model parameters, and simulation results are archived and versioned with a strong emphasis on repeatability and provenance. Cyberattacks were simulated against traffic networks, railway systems, and power/smart grids. Outreach included collaboration with NIST looking at cyberattacks against railway systems, and having a high school cybersecurity boot camp where students learned about common cybersecurity problems and mitigation techniques.
4. Policy Analytics for Cybersecurity of Cyber-Physical Systems
Nazli Choucri, Massachusetts Institute of Technology (Vanderbilt Sub-Lablet)
Cybersecurity policies and guidelines are in text form, and text imposes a powerful linearity and creates distortions. The project aims to introduce analytics for cybersecurity policy of cyber-physical systems in order to produce tools for analytics of cybersecurity policy. The researchers also want to reduce barriers to policy, enhance values of directives, and provide proof of concept and validation with use cases. The project uses the NIST Cyber Security Framework applied to a smart grid as a testbed.