Biblio
In this paper we explore the differential perceptions of cybersecurity professionals and general users regarding access rules and passwords. We conducted a preliminary survey involving 28 participants: 15 cybersecurity professionasl and 13 general users. We present our preliminary findings and explain how such survey data might be used to improve security in practice. We focus on user fatigue with access rules and passwords.
Presented at the NSA Science of Security Quarterly Meeting, July 2016.
We present a technique for bounded invariant verification of nonlinear networked dynamical systems with delayed interconnections. The underlying problem in precise boundedtime verification lies with computing bounds on the sensitivity of trajectories (or solutions) to changes in initial states and inputs of the system. For large networks, computing this sensitivity
with precision guarantees is challenging. We introduce the notion of input-to-state (IS) discrepancy of each module or subsystem in a larger nonlinear networked dynamical system. The IS discrepancy bounds the distance between two solutions or trajectories of a module in terms of their initial states and their inputs. Given the IS discrepancy functions of the modules, we show that it is possible to effectively construct a reduced (low dimensional) time-delayed dynamical system, such that the trajectory of this reduced model precisely bounds the distance between the trajectories of the complete network with changed initial states. Using the above results we develop a sound and relatively complete algorithm for bounded invariant verification of networked dynamical systems consisting of nonlinear modules interacting through possibly delayed signals. Finally, we introduce a local version of IS discrepancy and show that it is possible to compute them using only the Lipschitz constant and the Jacobian of the dynamic function of the modules.
Best Poster Award, Illinois Institute of Technology Research Day, April 11, 2016.
While there have been various studies identifying and classifying Android malware, there is limited discussion of the broader class of apps that fall in a gray area. Mobile grayware is distinct from PC grayware due to differences in operating system properties. Due to mobile grayware’s subjective nature, it is difficult to identify mobile grayware via program analysis alone. Instead, we hypothesize enhancing analysis with text analytics can effectively reduce human effort when triaging grayware. In this paper, we design and implement heuristics for seven main categories of grayware.We then use these heuristics to simulate grayware triage on a large set of apps from Google Play. We then present the results of our empirical study, demonstrating a clear problem of grayware. In doing so, we show how even relatively simple heuristics can quickly triage apps that take advantage of users in an undesirable way.
In recent years, online programming and software engineering education via information technology has gained a lot of popularity. Typically, popular courses often have hundreds or thousands of students but only a few course sta members. Tool automation is needed to maintain the quality of education. In this paper, we envision that the capability of quantifying behavioral similarity between programs is helpful for teaching and learning programming and software engineering, and propose three metrics that approximate the computation of behavioral similarity. Speci cally, we leverage random testing and dynamic symbolic execution (DSE) to generate test inputs, and run programs on these test inputs to compute metric values of the behavioral similarity. We evaluate our metrics on three real-world data sets from the Pex4Fun platform (which so far has accumulated more than 1.7 million game-play interactions). The results show that our metrics provide highly accurate approximation to the behavioral similarity. We also demonstrate a number of practical applications of our metrics including hint generation, progress indication, and automatic grading.
Anonymous messaging platforms like Whisper and Yik Yak allow users to spread messages over a network (e.g., a social network) without revealing message authorship to other users. The spread of messages on these platforms can be modeled by a diffusion process over a graph. Recent advances in network analysis have revealed that such diffusion processes are vulnerable to author deanonymization by adversaries with access to metadata, such as timing information. In this work, we ask the fundamental question of how to propagate anonymous messages over a graph to make it difficult for adversaries to infer the source. In particular, we study the performance of a message propagation protocol called adaptive diffusion introduced in (Fanti et al., 2015). We prove that when the adversary has access to metadata at a fraction of corrupted graph nodes, adaptive diffusion achieves asymptotically optimal source-hiding and significantly outperforms standard diffusion. We further demonstrate empirically that adaptive diffusion hides the source effectively on real social networks.
SDN’s logically centralized control provides an insertion point for programming the network. While it is generally agreed that higherlevel abstractions are needed to make that programming easy, there is little consensus on what are the “right” abstractions. Indeed, as SDN moves beyond its initial specialized deployments to broader use cases, it is likely that network control applications will require diverse abstractions that evolve over time. To this end, we champion a perspective that SDN control fundamentally revolves around data representation. We discard any application-specific structure that might be outgrown by new demands. Instead, we adopt a plain data representation of the entire network — network topology, forwarding, and control applications — and seek a universal data language that allows application programmers to transform the primitive representation into any high-level representations presented to applications or network operators. Driven by this insight, we present a system, Ravel, that implements an entire SDN network control infrastructure within a standard SQL database. In Ravel, network abstractions take the form of user-defined SQL views expressed by SQL queries that can be added on the fly. A key challenge in realizing this approach is to orchestrate multiple simultaneous abstractions that collectively affect the same underlying data. To achieve this, Ravel enhances the database with novel data integration mechanisms that merge the multiple views into a coherent forwarding behavior. Moreover, Ravel is exposed to applications through the one simple, familiar and highly interoperable SQL interface. While this is an ambitious long-term goal, our prototype built on the PostgreSQL database exhibits promising performance even for large scale networks.
The emerging software-defined networking (SDN) technology decouples the control plane from the data plane in a computer network with open and standardized interfaces, and hence opens up the network designers’ options and ability to innovate. The wide adoption of SDN in industry has motivated the development of large-scale, high-fidelity testbeds for evaluation of systems that incorporate SDN. In this article, we develop a framework to support OpenFlow-based SDN simulation and distributed emulation, by leveraging our prior work on a hybrid network testbed with a parallel network simulator and a virtual-machine-based emulation system. We show how to exploit typical SDN controller behaviors to handle performance issues caused by the centralized controller in parallel discrete-event simulation. In particular, we develop an asynchronous synchronization algorithm for passive SDN controllers and design a two-level architecture for active SDN controllers. We evaluate the system performance, showing good scalability. Finally, we present a case study, using the testbed, to evaluate network verification applications in an SDN-based data center network. CCS Concepts: Networks→Network simulations; Computing methodologies→Simulation
This paper presents a Factor Graph based framework called AttackTagger for highly accurate and preemptive detection of attacks, i.e., before the system misuse. We use secu- rity logs on real incidents that occurred over a six-year pe- riod at the National Center for Supercomputing Applica- tions (NCSA) to evaluate AttackTagger. Our data consist of security incidents that led to compromise of the target system, i.e., the attacks in the incidents were only identified after the fact by security analysts. AttackTagger detected 74 percent of attacks, and the majority them were detected before the system misuse. Finally, AttackTagger uncovered six hidden attacks that were not detected by intrusion de- tection systems during the incidents or by security analysts in post-incident forensic analysis.
In distributed optimization and iterative consensus literature, a standard problem is for N agents to minimize a function f over a subset of Rn, where the cost function is expressed as Σ fi . In this paper, we study the private distributed optimization (PDOP) problem with the additional requirement that the cost function of the individual agents should remain differentially private. The adversary attempts to infer information about the private cost functions from the messages that the agents exchange. Achieving differential privacy requires that any change of an individual’s cost function only results in unsubstantial changes in the statistics of the messages. We propose a class of iterative algorithms for solving PDOP, which achieves differential privacy and convergence to the optimal value. Our analysis reveals the dependence of the achieved accuracy and the privacy levels on the the parameters of the algorithm.
The concept of differential privacy stems from the study of private query of datasets. In this work, we apply this concept to metric spaces to study a mechanism that randomizes a deterministic query by adding mean-zero noise to keep differential privacy.
Design and testing of pacemaker is challenging because of the need to capture the interaction between the physical processes (e.g. voltage signal in cardiac tissue) and the embedded software (e.g. a pacemaker). At the same time, there is a growing need for design and certification methodologies that can provide quality assurance for the embedded software. We describe recent progress in simulation-based techniques that are capable of ensuring guaranteed coverage. Our methods employ discrep- ancy functions, which impose bounds on system dynamics, and proceed through iteratively constructing over-approximations of the reachable set of states. We are able to prove time bounded safety or produce counterexamples. We illustrate the techniques by analyzing a family of pacemaker designs against time duration requirements and synthesize safe parameter ranges. We conclude by outlining the potential uses of this technology to improve the safety of medical device designs.
We present a controller synthesis algorithm for a discrete time reach-avoid problem in the presence of adversaries. Our model of the adversary captures typical malicious attacks en- visioned on cyber-physical systems such as sensor spoofing, controller corruption, and actuator intrusion. After formu- lating the problem in a general setting, we present a sound and complete algorithm for the case with linear dynamics and an adversary with a budget on the total L2-norm of its actions. The algorithm relies on a result from linear control theory that enables us to decompose and precisely compute the reachable states of the system in terms of a symbolic simulation of the adversary-free dynamics and the total uncertainty induced by the adversary. With this de- composition, the synthesis problem eliminates the universal quantifier on the adversary’s choices and the symbolic con- troller actions can be effectively solved using an SMT solver. The constraints induced by the adversary are computed by solving second-order cone programmings. The algorithm is later extended to synthesize state-dependent controller and to generate attacks for the adversary. We present prelimi- nary experimental results that show the effectiveness of this approach on several example problems.
Applications in mobile marketplaces may leak private user information without notification. Existing mobile platforms provide little information on how applications use private user data, making it difficult for experts to validate appli- cations and for users to grant applications access to their private data. We propose a user-aware-privacy-control approach, which reveals how private information is used inside applications. We compute static information flows and classify them as safe/un- safe based on a tamper analysis that tracks whether private data is obscured before escaping through output channels. This flow information enables platforms to provide default settings that expose private data for only safe flows, thereby preserving privacy and minimizing decisions required from users. We build our approach into TouchDe- velop, an application-creation environment that allows users to write scripts on mobile devices and install scripts published by other users. We evaluate our approach by studying 546 scripts published by 194 users, and the results show that our approach effectively reduces the need to make access-granting choices to only 10.1 % (54) of all scripts. We also conduct a user survey that involves 50 TouchDevelop users to assess the effectiveness and usability of our approach. The results show that 90 % of the users consider our approach useful in protecting their privacy, and 54 % prefer our approach over other privacy-control approaches.
Mobile malware attempts to evade detection during app analysis by mimicking security-sensitive behaviors of benign apps that provide similar functionality (e.g., sending SMS mes- sages), and suppressing their payload to reduce the chance of being observed (e.g., executing only its payload at night). Since current approaches focus their analyses on the types of security- sensitive resources being accessed (e.g., network), these evasive techniques in malware make differentiating between malicious and benign app behaviors a difficult task during app analysis. We propose that the malicious and benign behaviors within apps can be differentiated based on the contexts that trigger security- sensitive behaviors, i.e., the events and conditions that cause the security-sensitive behaviors to occur. In this work, we introduce AppContext, an approach of static program analysis that extracts the contexts of security-sensitive behaviors to assist app analysis in differentiating between malicious and benign behaviors. We implement a prototype of AppContext and evaluate AppContext on 202 malicious apps from various malware datasets, and 633 benign apps from the Google Play Store. AppContext correctly identifies 192 malicious apps with 87.7% precision and 95% recall. Our evaluation results suggest that the maliciousness of a security-sensitive behavior is more closely related to the intention of the behavior (reflected via contexts) than the type of the security-sensitive resources that the behavior accesses.
Sophistication and flexibility of software development make it easy to leave security vulnerabilities in software applications for attack- ers. It is critical to educate and train software engineers to avoid in- troducing vulnerabilities in software applications in the first place such as adopting secure coding mechanisms and conducting secu- rity testing. A number of websites provide training grounds to train people’s hacking skills, which are highly related to security test- ing skills, and train people’s secure coding skills. However, there exists no interactive gaming platform for instilling gaming aspects into the education and training of secure coding. To address this issue, we propose to construct secure coding duels in Code Hunt, a high-impact serious gaming platform released by Microsoft Re- search. In Code Hunt, a coding duel consists of two code segments: a secret code segment and a player-visible code segment. To solve a coding duel, a player iteratively modifies the player-visible code segment to match the functional behaviors of the secret code seg- ment. During the duel-solving process, the player is given clues as a set of automatically generated test cases to characterize sample functional behaviors of the secret code segment. The game aspect in Code Hunt is to recognize a pattern from the test cases, and to re-engineer the player-visible code segment to exhibit the expected behaviors. Secure coding duels proposed in this work are coding duels that are carefully designed to train players’ secure coding skills, such as sufficient input validation and access control.
In real world domains, from healthcare to power to finance, we deploy computer systems intended to streamline and im- prove the activities of human agents in the corresponding non-cyber worlds. However, talking to actual users (instead of just computer security experts) reveals endemic circum- vention of the computer-embedded rules. Good-intentioned users, trying to get their jobs done, systematically work around security and other controls embedded in their IT systems.
This poster reports on our work compiling a large corpus of such incidents and developing a model based on semi- otic triads to examine security circumvention. This model suggests that mismorphisms—mappings that fail to preserve structure—lie at the heart of circumvention scenarios; dif- ferential perceptions and needs explain users’ actions. We support this claim with empirical data from the corpus.
Agent-based modeling can serve as a valuable asset to security personnel who wish to better understand the security landscape within their organization, especially as it relates to user behavior and circumvention. In this paper, we ar- gue in favor of cognitive behavioral agent-based modeling for usable security, report on our work on developing an agent- based model for a password management scenario, perform a sensitivity analysis, which provides us with valuable insights into improving security (e.g., an organization that wishes to suppress one form of circumvention may want to endorse another), and provide directions for future work.
In real world domains, from healthcare to power to finance, we deploy computer systems intended to streamline and improve the activities of human agents in the corresponding non-cyber worlds. However, talking to actual users (instead of just computer security experts) reveals endemic circumvention of the computer-embedded rules. Good-intentioned users, trying to get their jobs done, systematically work around security and other controls embedded in their IT systems.
This paper reports on our work compiling a large corpus of such incidents and developing a model based on semiotic triads to examine security circumvention. This model suggests that mismorphisms— mappings that fail to preserve structure—lie at the heart of circumvention scenarios; differential percep- tions and needs explain users’ actions. We support this claim with empirical data from the corpus.
Many current VM monitoring approaches require guest OS modifications and are also unable to perform application level monitoring, reducing their value in a cloud setting. This paper introduces hprobes, a framework that allows one to dynamically monitor applications and operating systems inside a VM. The hprobe framework does not require any changes to the guest OS, which avoids the tight coupling of monitoring with its target. Furthermore, the monitors can be customized and enabled/disabled while the VM is running. To demonstrate the usefulness of this framework, we present three sample detectors: an emergency detector for a security vulnerability, an application watchdog, and an infinite-loop detector. We test our detectors on real applications and demonstrate that those detectors achieve an acceptable level of performance overhead with a high degree of flexibility.
This paper presents the architecture of an end-to-end secu- rity testbed and security analytics framework, which aims to: i) understand real-world exploitation of known security vulnerabilities and ii) preemptively detect multi-stage at- tacks, i.e., before the system misuse. With the increasing number of security vulnerabilities, it is necessary for secu- rity researchers and practitioners to understand: i) system and network behaviors under attacks and ii) potential ef- fects of attacks to the target infrastructure. To safely em- ulate and instrument exploits of known vulnerabilities, we use virtualization techniques to isolate attacks in contain- ers, e.g., Linux-based containers or Virtual Machines, and to deploy monitors, e.g., kernel probes or network packet captures, across a system and network stack. To infer the evolution of attack stages from monitoring data, we use a probabilistic graphical model, namely AttackTagger, that represents learned knowledge of simulated attacks in our se- curity testbed and real-world attacks. Experiments are be- ing run on a real-world deployment of the framework at the National Center for Supercomputing Applications (NCSA) at the University of Illinois at Urbana-Champaign.
Hypervisor activity is designed to be hidden from guest Virtual Machines (VM) as well as external observers. In this paper, we demonstrate that this does not always occur. We present a method by which an external observer can learn sensitive information about hypervisor internals, such as VM scheduling or hypervisor-level monitoring schemes, by observing a VM. We refer to this capability as Hypervisor Introspection (HI).
HI can be viewed as the inverse process of the well-known Virtual Machine Introspection (VMI) technique. VMI is a technique to extract VMs’ internal state from the hypervi- sor, facilitating the implementation of reliability and security monitors[1]. Conversely, HI is a technique that allows VMs to autonomously extract hypervisor information. This capability enables a wide range of attacks, for example, learning a hypervisor’s properties (version, configuration, etc.), defeating hypervisor-level monitoring systems, and compromising the confidentiality of co-resident VMs. This paper focuses on the discovery of a channel to implement HI, and then leveraging that channel for a novel attack against traditional VMI.
In order to perform HI, there must be a method of extracting information from the hypervisor. Since this information is intentionally hidden from a VM, we make use of a side channel. When the hypervisor checks a VM using VMI, VM execution (e.g. network communication between a VM and a remote system) must pause. Therefore, information regarding the hypervisor’s activity can be leaked through this suspension of execution. We call this side channel the VM suspend side channel, illustrated in Fig. 1. As a proof of concept, this paper presents how correlating the results of in-VM micro- benchmarking and out-of-VM reference monitoring can be used to determine when hypervisor-level monitoring tools are vulnerable to attacks.