Visible to the public Biblio

Filters: Author is Yao, Danfeng (Daphne)  [Clear All Filters]
2019-05-08
Yao, Danfeng(Daphne).  2018.  Data Breach and Multiple Points to Stop It. Proceedings of the 23Nd ACM on Symposium on Access Control Models and Technologies. :1–1.
Preventing unauthorized access to sensitive data is an exceedingly complex access control problem. In this keynote, I will break down the data breach problem and give insights into how organizations could and should do to reduce their risks. The talk will start with discussing the technical reasons behind some of the recent high-profile data breach incidents (e.g., in Equifax, Target), as well as pointing out the threats of inadvertent or accidental data leaks. Then, I will show that there are usually multiple points to stop data breach and give an overview of the relevant state-of-the-art solutions. I will focus on some of the recent algorithmic advances in preventing inadvertent data loss, including set-based and alignment-based screening techniques, outsourced screening, and GPU-based performance acceleration. I will also briefly discuss the role of non-technical factors (e.g., organizational culture on security) in data protection. Because of the cat-and-mouse-game nature of cybersecurity, achieving absolute data security is impossible. However, proactively securing critical data paths through strategic planning and placement of security tools will help reduce the risks. I will also point out a few exciting future research directions, e.g., on data leak detection as a cloud security service and deep learning for reducing false alarms in continuous authentication and the prickly insider-threat detection.
2018-11-19
Kedrowitsch, Alexander, Yao, Danfeng(Daphne), Wang, Gang, Cameron, Kirk.  2017.  A First Look: Using Linux Containers for Deceptive Honeypots. Proceedings of the 2017 Workshop on Automated Decision Making for Active Cyber Defense. :15–22.

The ever-increasing sophistication of malware has made malicious binary collection and analysis an absolute necessity for proactive defenses. Meanwhile, malware authors seek to harden their binaries against analysis by incorporating environment detection techniques, in order to identify if the binary is executing within a virtual environment or in the presence of monitoring tools. For security researchers, it is still an open question regarding how to remove the artifacts from virtual machines to effectively build deceptive "honeypots" for malware collection and analysis. In this paper, we explore a completely different and yet promising approach by using Linux containers. Linux containers, in theory, have minimal virtualization artifacts and are easily deployable on low-power devices. Our work performs the first controlled experiments to compare Linux containers with bare metal and 5 major types of virtual machines. We seek to measure the deception capabilities offered by Linux containers to defeat mainstream virtual environment detection techniques. In addition, we empirically explore the potential weaknesses in Linux containers to help defenders to make more informed design decisions.

2017-06-05
Zhang, Hao, Yao, Danfeng(Daphne), Ramakrishnan, Naren.  2016.  Causality-based Sensemaking of Network Traffic for Android Application Security. Proceedings of the 2016 ACM Workshop on Artificial Intelligence and Security. :47–58.

Malicious Android applications pose serious threats to mobile security. They threaten the data confidentiality and system integrity on Android devices. Monitoring runtime activities serves as an important technique for analyzing dynamic app behaviors. We design a triggering relation model for dynamically analyzing network traffic on Android devices. Our model enables one to infer the dependency of outbound network requests from the device. We describe a new machine learning approach for discovering the dependency of network requests. These request-level dependence relations are used to detect stealthy malware activities. Malicious requests are identified due to the lack of dependency with legitimate triggers. Our prototype is evaluated on 14GB network traffic data and system logs collected from an Android tablet. Experimental results show that our solution achieves a high accuracy (99.1%) in detecting malicious requests sent from new malicious apps.

2017-05-16
Yu, Xiaodong, Feng, Wu-chun, Yao, Danfeng(Daphne), Becchi, Michela.  2016.  O3FA: A Scalable Finite Automata-based Pattern-Matching Engine for Out-of-Order Deep Packet Inspection. Proceedings of the 2016 Symposium on Architectures for Networking and Communications Systems. :1–11.

To match the signatures of malicious traffic across packet boundaries, network-intrusion detection (and prevention) systems (NIDS) typically perform pattern matching after flow reassembly or packet reordering. However, this may lead to the need for large packet buffers, making detection vulnerable to denial-of-service (DoS) attacks, whereby attackers exhaust the buffer capacity by sending long sequences of out-of-order packets. While researchers have proposed solutions for exact-match patterns, regular-expression matching on out-of-order packets is still an open problem. Specifically, a key challenge is the matching of complex sub-patterns (such as repetitions of wildcards matched at the boundary between packets). Our proposed approach leverages the insight that various segments matching the same repetitive sub-pattern are logically equivalent to the regular-expression matching engine, and thus, inter-changing them would not affect the final result. In this paper, we present O3FA, a new finite automata-based, deep packet-inspection engine to perform regular-expression matching on out-of-order packets without requiring flow reassembly. O3FA consists of a deterministic finite automaton (FA) coupled with a set of prefix-/suffix-FA, which allows processing out-of-order packets on the fly. We present our design, optimization, and evaluation for the O3FA engine. Our experiments show that our design requires 20x-4000x less buffer space than conventional buffering-and-reassembling schemes on various datasets and that it can process packets in real-time, i.e., without reassembly.