Biblio | CPS-VO

Scofield, Daniel, Miles, Craig, Kuhn, Stephen. 2017. Fast Model Learning for the Detection of Malicious Digital Documents. Proceedings of the 7th Software Security, Protection, and Reverse Engineering / Software Security and Protection Workshop. :3:1–3:8.

Modern cyber attacks are often conducted by distributing digital documents that contain malware. The approach detailed herein, which consists of a classifier that uses features derived from dynamic analysis of a document viewer as it renders the document in question, is capable of classifying the disposition of digital documents with greater than 98% accuracy even when its model is trained on just small amounts of data. To keep the classification model itself small and thereby to provide scalability, we employ an entity resolution strategy that merges syntactically disparate features that are thought to be semantically equivalent but vary due to programmatic randomness. Entity resolution enables construction of a comprehensive model of benign functionality using relatively few training documents, and the model does not improve significantly with additional training data.

Miramirkhani, N., Appini, M. P., Nikiforakis, N., Polychronakis, M.. 2017. Spotless Sandboxes: Evading Malware Analysis Systems Using Wear-and-Tear Artifacts. 2017 IEEE Symposium on Security and Privacy (SP). :1009–1024.

Malware sandboxes, widely used by antivirus companies, mobile application marketplaces, threat detection appliances, and security researchers, face the challenge of environment-aware malware that alters its behavior once it detects that it is being executed on an analysis environment. Recent efforts attempt to deal with this problem mostly by ensuring that well-known properties of analysis environments are replaced with realistic values, and that any instrumentation artifacts remain hidden. For sandboxes implemented using virtual machines, this can be achieved by scrubbing vendor-specific drivers, processes, BIOS versions, and other VM-revealing indicators, while more sophisticated sandboxes move away from emulation-based and virtualization-based systems towards bare-metal hosts. We observe that as the fidelity and transparency of dynamic malware analysis systems improves, malware authors can resort to other system characteristics that are indicative of artificial environments. We present a novel class of sandbox evasion techniques that exploit the "wear and tear" that inevitably occurs on real systems as a result of normal use. By moving beyond how realistic a system looks like, to how realistic its past use looks like, malware can effectively evade even sandboxes that do not expose any instrumentation indicators, including bare-metal systems. We investigate the feasibility of this evasion strategy by conducting a large-scale study of wear-and-tear artifacts collected from real user devices and publicly available malware analysis services. The results of our evaluation are alarming: using simple decision trees derived from the analyzed data, malware can determine that a system is an artificial environment and not a real user device with an accuracy of 92.86%. As a step towards defending against wear-and-tear malware evasion, we develop statistical models that capture a system's age and degree of use, which can be used to aid sandbox operators in creating system i- ages that exhibit a realistic wear-and-tear state.

Hirotomo, M., Nishio, Y., Kamizono, M., Fukuta, Y., Mohri, M., Shiraishi, Y.. 2017. Efficient Method for Analyzing Malicious Websites by Using Multi-Environment Analysis System. 2017 12th Asia Joint Conference on Information Security (AsiaJCIS). :48–54.

The malicious websites used by drive-by download attacks change their behavior for web client environments. To analyze the behavior of malicious websites, the single-environment analysis cannot obtain sufficient information. Hence, it is difficult to analyze the whole aspect of malicious websites. Also, the code obfuscation and cloaking are used in malicious websites to avoid to be analyzed their behavior. In this paper, we propose an analyzing method that combines decoding of the obfuscation code with dynamic analysis using multi-environment analysis system in order to analyze the behavior of the malicious websites in detail. Furthermore, we present two approaches to improve the multi-environment analysis. The first one is automation of traffic log analysis to reduce the cost of analyzing huge traffic logs between the environments and malicious websites. The second one is multimodal analysis for finding the URL of malicious websites.

Hoole, Alexander M., Traore, Issa, Delaitre, Aurelien, de Oliveira, Charles. 2016. Improving Vulnerability Detection Measurement: [Test Suites and Software Security Assurance]. Proceedings of the 20th International Conference on Evaluation and Assessment in Software Engineering. :27:1–27:10.

The Software Assurance Metrics and Tool Evaluation (SAMATE) project at the National Institute of Standards and Technology (NIST) has created the Software Assurance Reference Dataset (SARD) to provide researchers and software security assurance tool developers with a set of known security flaws. As part of an empirical evaluation of a runtime monitoring framework, two test suites were executed and monitored, revealing deficiencies which led to a collaboration with the NIST SAMATE team to provide replacements. Test Suites 45 and 46 are analyzed, discussed, and updated to improve accuracy, consistency, preciseness, and automation. Empirical results show metrics such as recall, precision, and F-Measure are all impacted by invalid base assumptions regarding the test suites.

Hoole, Alexander M., Traore, Issa, Delaitre, Aurelien, de Oliveira, Charles. 2016. Improving Vulnerability Detection Measurement: [Test Suites and Software Security Assurance]. Proceedings of the 20th International Conference on Evaluation and Assessment in Software Engineering. :27:1–27:10.

The Software Assurance Metrics and Tool Evaluation (SAMATE) project at the National Institute of Standards and Technology (NIST) has created the Software Assurance Reference Dataset (SARD) to provide researchers and software security assurance tool developers with a set of known security flaws. As part of an empirical evaluation of a runtime monitoring framework, two test suites were executed and monitored, revealing deficiencies which led to a collaboration with the NIST SAMATE team to provide replacements. Test Suites 45 and 46 are analyzed, discussed, and updated to improve accuracy, consistency, preciseness, and automation. Empirical results show metrics such as recall, precision, and F-Measure are all impacted by invalid base assumptions regarding the test suites.

Tromer, Eran, Schuster, Roei. 2016. DroidDisintegrator: Intra-Application Information Flow Control in Android Apps. Proceedings of the 11th ACM on Asia Conference on Computer and Communications Security. :401–412.

In mobile platforms and their app markets, controlling app permissions and preventing abuse of private information are crucial challenges. Information Flow Control (IFC) is a powerful approach for formalizing and answering user concerns such as: "Does this app send my geolocation to the Internet?" Yet despite intensive research efforts, IFC has not been widely adopted in mainstream programming practice. Abstract We observe that the typical structure of Android apps offers an opportunity for a novel and effective application of IFC. In Android, an app consists of a collection of a few dozen "components", each in charge of some high-level functionality. Most components do not require access to most resources. These components are a natural and effective granularity at which to apply IFC (as opposed to the typical process-level or language-level granularity). By assigning different permission labels to each component, and limiting information flow between components, it is possible to express and enforce IFC constraints. Yet nuances of the Android platform, such as its multitude of discretionary (and somewhat arcane) communication channels, raise challenges in defining and enforcing component boundaries. Abstract We build a system, DroidDisintegrator, which demonstrates the viability of component-level IFC for expressing and controlling app behavior. DroidDisintegrator uses dynamic analysis to generate IFC policies for Android apps, repackages apps to embed these policies, and enforces the policies at runtime. We evaluate DroidDisintegrator on dozens of apps.

Costin, Andrei, Zarras, Apostolis, Francillon, Aurélien. 2016. Automated Dynamic Firmware Analysis at Scale: A Case Study on Embedded Web Interfaces. Proceedings of the 11th ACM on Asia Conference on Computer and Communications Security. :437–448.

Embedded devices are becoming more widespread, interconnected, and web-enabled than ever. However, recent studies showed that embedded devices are far from being secure. Moreover, many embedded systems rely on web interfaces for user interaction or administration. Web security is still difficult and therefore the web interfaces of embedded systems represent a considerable attack surface. In this paper, we present the first fully automated framework that applies dynamic firmware analysis techniques to achieve, in a scalable manner, automated vulnerability discovery within embedded firmware images. We apply our framework to study the security of embedded web interfaces running in Commercial Off-The-Shelf (COTS) embedded devices, such as routers, DSL/cable modems, VoIP phones, IP/CCTV cameras. We introduce a methodology and implement a scalable framework for discovery of vulnerabilities in embedded web interfaces regardless of the devices' vendor, type, or architecture. To reach this goal, we perform full system emulation to achieve the execution of firmware images in a software-only environment, i.e., without involving any physical embedded devices. Then, we automatically analyze the web interfaces within the firmware using both static and dynamic analysis tools. We also present some interesting case-studies and discuss the main challenges associated with the dynamic analysis of firmware images and their web interfaces and network services. The observations we make in this paper shed light on an important aspect of embedded devices which was not previously studied at a large scale.

Vemparala, Swapna, Di Troia, Fabio, Corrado, Visaggio Aaron, Austin, Thomas H., Stamo, Mark. 2016. Malware Detection Using Dynamic Birthmarks. Proceedings of the 2016 ACM on International Workshop on Security And Privacy Analytics. :41–46.

In this paper, we compare the effectiveness of Hidden Markov Models (HMMs) with that of Profile Hidden Markov Models (PHMMs), where both are trained on sequences of API calls. We compare our results to static analysis using HMMs trained on sequences of opcodes, and show that dynamic analysis achieves significantly stronger results in many cases. Furthermore, in comparing our two dynamic analysis approaches, we find that using PHMMs consistently outperforms our technique based on HMMs.

Squires, Walter, Centonze, Paolina. 2016. Cross-platform Access-rights Analysis of Mobile Applications. Proceedings of the International Conference on Mobile Software Engineering and Systems. :295–296.

We live in the era of mobile computing. Mobile devices have more sensors and more capabilities than desktop computers. For any computing device that contains sensitive information and accesses the Internet, security is a major concern for both enterprises and end-users. Of the mobile devices commonly in The emphasis of this research focuses on to the ways in which the popular iOS and Android platforms handle permissions in an attempt to discern if there are any identifiable trends on either platform w.r.t. applications being over- or underprivileged.

Kwon, Yonghwi, Kim, Dohyeong, Sumner, William Nick, Kim, Kyungtae, Saltaformaggio, Brendan, Zhang, Xiangyu, Xu, Dongyan. 2016. LDX: Causality Inference by Lightweight Dual Execution. Proceedings of the Twenty-First International Conference on Architectural Support for Programming Languages and Operating Systems. :503–515.

Causality inference, such as dynamic taint anslysis, has many applications (e.g., information leak detection). It determines whether an event e is causally dependent on a preceding event c during execution. We develop a new causality inference engine LDX. Given an execution, it spawns a slave execution, in which it mutates c and observes whether any change is induced at e. To preclude non-determinism, LDX couples the executions by sharing syscall outcomes. To handle path differences induced by the perturbation, we develop a novel on-the-fly execution alignment scheme that maintains a counter to reflect the progress of execution. The scheme relies on program analysis and compiler transformation. LDX can effectively detect information leak and security attacks with an average overhead of 6.08% while running the master and the slave concurrently on separate CPUs, much lower than existing systems that require instruction level monitoring. Furthermore, it has much better accuracy in causality inference.

Masood, A., Java, J.. 2015. Static analysis for web service security - Tools amp; techniques for a secure development life cycle. 2015 IEEE International Symposium on Technologies for Homeland Security (HST). :1–6.

In this ubiquitous IoT (Internet of Things) era, web services have become a vital part of today's critical national and public sector infrastructure. With the industry wide adaptation of service-oriented architecture (SOA), web services have become an integral component of enterprise software eco-system, resulting in new security challenges. Web services are strategic components used by wide variety of organizations for information exchange on the internet scale. The public deployments of mission critical APIs opens up possibility of software bugs to be maliciously exploited. Therefore, vulnerability identification in web services through static as well as dynamic analysis is a thriving and interesting area of research in academia, national security and industry. Using OWASP (Open Web Application Security Project) web services guidelines, this paper discusses the challenges of existing standards, and reviews new techniques and tools to improve services security by detecting vulnerabilities. Recent vulnerabilities like Shellshock and Heartbleed has shifted the focus of risk assessment to the application layer, which for majority of organization means public facing web services and web/mobile applications. RESTFul services have now become the new service development paradigm normal; therefore SOAP centric standards such as XML Encryption, XML Signature, WS-Security, and WS-SecureConversation are nearly not as relevant. In this paper we provide an overview of the OWASP top 10 vulnerabilities for web services, and discuss the potential static code analysis techniques to discover these vulnerabilities. The paper reviews the security issues targeting web services, software/program verification and security development lifecycle.

Derhab, A., Bouras, A., Bin Muhaya, F., Khan, M.K., Yang Xiang. 2014. Spam Trapping System: Novel security framework to fight against spam botnets. Telecommunications (ICT), 2014 21st International Conference on. :467-471.

In this paper, we inspire from two analogies: the warfare kill zone and the airport check-in system, to tackle the issue of spam botnet detection. We add a new line of defense to the defense-in-depth model called the third line. This line is represented by a security framework, named the Spam Trapping System (STS) and adopts the prevent-then-detect approach to fight against spam botnets. The framework exploits the application sandboxing principle to prevent the spam from going out of the host and detect the corresponding malware bot. We show that the proposed framework can ensure better security against malware bots. In addition, an analytical study demonstrates that the framework offers optimal performance in terms of detection time and computational cost in comparison to intrusion detection systems based on static and dynamic analysis.

Cam, H., Mouallem, P., Yilin Mo, Sinopoli, B., Nkrumah, B.. 2014. Modeling impact of attacks, recovery, and attackability conditions for situational awareness. Cognitive Methods in Situation Awareness and Decision Support (CogSIMA), 2014 IEEE International Inter-Disciplinary Conference on. :181-187.

A distributed cyber control system comprises various types of assets, including sensors, intrusion detection systems, scanners, controllers, and actuators. The modeling and analysis of these components usually require multi-disciplinary approaches. This paper presents a modeling and dynamic analysis of a distributed cyber control system for situational awareness by taking advantage of control theory and time Petri net. Linear time-invariant systems are used to model the target system, attacks, assets influences, and an anomaly-based intrusion detection system. Time Petri nets are used to model the impact and timing relationships of attacks, vulnerability, and recovery at every node. To characterize those distributed control systems that are perfectly attackable, algebraic and topological attackability conditions are derived. Numerical evaluation is performed to determine the impact of attacks on distributed control system.

Kumar, S., Rama Krishna, C., Aggarwal, N., Sehgal, R., Chamotra, S.. 2014. Malicious data classification using structural information and behavioral specifications in executables. Engineering and Computational Sciences (RAECS), 2014 Recent Advances in. :1-6.

With the rise in the underground Internet economy, automated malicious programs popularly known as malwares have become a major threat to computers and information systems connected to the internet. Properties such as self healing, self hiding and ability to deceive the security devices make these software hard to detect and mitigate. Therefore, the detection and the mitigation of such malicious software is a major challenge for researchers and security personals. The conventional systems for the detection and mitigation of such threats are mostly signature based systems. Major drawback of such systems are their inability to detect malware samples for which there is no signature available in their signature database. Such malwares are known as zero day malware. Moreover, more and more malware writers uses obfuscation technology such as polymorphic and metamorphic, packing, encryption, to avoid being detected by antivirus. Therefore, the traditional signature based detection system is neither effective nor efficient for the detection of zero-day malware. Hence to improve the effectiveness and efficiency of malware detection system we are using classification method based on structural information and behavioral specifications. In this paper we have used both static and dynamic analysis approaches. In static analysis we are extracting the features of an executable file followed by classification. In dynamic analysis we are taking the traces of executable files using NtTrace within controlled atmosphere. Experimental results obtained from our algorithm indicate that our proposed algorithm is effective in extracting malicious behavior of executables. Further it can also be used to detect malware variants.

Salman, A., Elhajj, I.H., Chehab, A., Kayssi, A.. 2014. DAIDS: An Architecture for Modular Mobile IDS. Advanced Information Networking and Applications Workshops (WAINA), 2014 28th International Conference on. :328-333.

The popularity of mobile devices and the enormous number of third party mobile applications in the market have naturally lead to several vulnerabilities being identified and abused. This is coupled with the immaturity of intrusion detection system (IDS) technology targeting mobile devices. In this paper we propose a modular host-based IDS framework for mobile devices that uses behavior analysis to profile applications on the Android platform. Anomaly detection can then be used to categorize malicious behavior and alert users. The proposed system accommodates different detection algorithms, and is being tested at a major telecom operator in North America. This paper highlights the architecture, findings, and lessons learned.

Cam, H., Mouallem, P., Yilin Mo, Sinopoli, B., Nkrumah, B.. 2014. Modeling impact of attacks, recovery, and attackability conditions for situational awareness. Cognitive Methods in Situation Awareness and Decision Support (CogSIMA), 2014 IEEE International Inter-Disciplinary Conference on. :181-187.

A distributed cyber control system comprises various types of assets, including sensors, intrusion detection systems, scanners, controllers, and actuators. The modeling and analysis of these components usually require multi-disciplinary approaches. This paper presents a modeling and dynamic analysis of a distributed cyber control system for situational awareness by taking advantage of control theory and time Petri net. Linear time-invariant systems are used to model the target system, attacks, assets influences, and an anomaly-based intrusion detection system. Time Petri nets are used to model the impact and timing relationships of attacks, vulnerability, and recovery at every node. To characterize those distributed control systems that are perfectly attackable, algebraic and topological attackability conditions are derived. Numerical evaluation is performed to determine the impact of attacks on distributed control system.

Schwartz, E.J., Avgerinos, T., Brumley, D.. 2010. All You Ever Wanted to Know about Dynamic Taint Analysis and Forward Symbolic Execution (but Might Have Been Afraid to Ask). Security and Privacy (SP), 2010 IEEE Symposium on. :317-331.

Dynamic taint analysis and forward symbolic execution are quickly becoming staple techniques in security analyses. Example applications of dynamic taint analysis and forward symbolic execution include malware analysis, input filter generation, test case generation, and vulnerability discovery. Despite the widespread usage of these two techniques, there has been little effort to formally define the algorithms and summarize the critical issues that arise when these techniques are used in typical security contexts. The contributions of this paper are two-fold. First, we precisely describe the algorithms for dynamic taint analysis and forward symbolic execution as extensions to the run-time semantics of a general language. Second, we highlight important implementation choices, common pitfalls, and considerations when using these techniques in a security context.

Rossow, C., Dietrich, C.J., Grier, C., Kreibich, C., Paxson, V., Pohlmann, N., Bos, H., van Steen, M.. 2012. Prudent Practices for Designing Malware Experiments: Status Quo and Outlook. Security and Privacy (SP), 2012 IEEE Symposium on. :65-79.

Malware researchers rely on the observation of malicious code in execution to collect datasets for a wide array of experiments, including generation of detection models, study of longitudinal behavior, and validation of prior research. For such research to reflect prudent science, the work needs to address a number of concerns relating to the correct and representative use of the datasets, presentation of methodology in a fashion sufficiently transparent to enable reproducibility, and due consideration of the need not to harm others. In this paper we study the methodological rigor and prudence in 36 academic publications from 2006-2011 that rely on malware execution. 40% of these papers appeared in the 6 highest-ranked academic security conferences. We find frequent shortcomings, including problematic assumptions regarding the use of execution-driven datasets (25% of the papers), absence of description of security precautions taken during experiments (71% of the articles), and oftentimes insufficient description of the experimental setup. Deficiencies occur in top-tier venues and elsewhere alike, highlighting a need for the community to improve its handling of malware datasets. In the hope of aiding authors, reviewers, and readers, we frame guidelines regarding transparency, realism, correctness, and safety for collecting and using malware datasets.