Visible to the public Biblio

Found 16998 results

2017-03-29
Aono, Yoshinori, Hayashi, Takuya, Trieu Phong, Le, Wang, Lihua.  2016.  Scalable and Secure Logistic Regression via Homomorphic Encryption. Proceedings of the Sixth ACM Conference on Data and Application Security and Privacy. :142–144.

Logistic regression is a powerful machine learning tool to classify data. When dealing with sensitive data such as private or medical information, cares are necessary. In this paper, we propose a secure system for protecting the training data in logistic regression via homomorphic encryption. Perhaps surprisingly, despite the non-polynomial tasks of training in logistic regression, we show that only additively homomorphic encryption is needed to build our system. Our system is secure and scalable with the dataset size.

White, Martin, Tufano, Michele, Vendome, Christopher, Poshyvanyk, Denys.  2016.  Deep Learning Code Fragments for Code Clone Detection. Proceedings of the 31st IEEE/ACM International Conference on Automated Software Engineering. :87–98.

Code clone detection is an important problem for software maintenance and evolution. Many approaches consider either structure or identifiers, but none of the existing detection techniques model both sources of information. These techniques also depend on generic, handcrafted features to represent code fragments. We introduce learning-based detection techniques where everything for representing terms and fragments in source code is mined from the repository. Our code analysis supports a framework, which relies on deep learning, for automatically linking patterns mined at the lexical level with patterns mined at the syntactic level. We evaluated our novel learning-based approach for code clone detection with respect to feasibility from the point of view of software maintainers. We sampled and manually evaluated 398 file- and 480 method-level pairs across eight real-world Java systems; 93% of the file- and method-level samples were evaluated to be true positives. Among the true positives, we found pairs mapping to all four clone types. We compared our approach to a traditional structure-oriented technique and found that our learning-based approach detected clones that were either undetected or suboptimally reported by the prominent tool Deckard. Our results affirm that our learning-based approach is suitable for clone detection and a tenable technique for researchers.

Al-Sibahi, Ahmad Salim, Dimovski, Aleksandar S., Wąsowski, Andrzej.  2016.  Symbolic Execution of High-level Transformations. Proceedings of the 2016 ACM SIGPLAN International Conference on Software Language Engineering. :207–220.

Transformations form an important part of developing domain specific languages, where they are used to provide semantics for typing and evaluation. Yet, few solutions exist for verifying transformations written in expressive high-level transformation languages. We take a step towards that goal, by developing a general symbolic execution technique that handles programs written in these high-level transformation languages. We use logical constraints to describe structured symbolic values, including containment, acyclicity, simple unordered collections (sets) and to handle deep type-based querying of syntax hierarchies. We evaluate this symbolic execution technique on a collection of refactoring and model transformation programs, showing that the white-box test generation tool based on symbolic execution obtains better code coverage than a black box test generator for such programs in almost all tested cases.

Afshari, Mehrdad, Su, Zhendong.  2016.  Building White-box Abstractions by Program Refinement. Proceedings of the 2016 ACM International Symposium on New Ideas, New Paradigms, and Reflections on Programming and Software. :74–81.

Abstractions make building complex systems possible. Many facilities provided by a modern programming language are directly designed to build a certain style of abstraction. Abstractions also aim to enhance code reusability, thus enhancing programmer productivity and effectiveness. Real-world software systems can grow to have a complicated hierarchy of abstractions. Often, the hierarchy grows unnecessarily deep, because the programmers have envisioned the most generic use cases for a piece of code to make it reusable. Sometimes, the abstractions used in the program are not the appropriate ones, and it would be simpler for the higher level client to circumvent such abstractions. Another problem is the impedance mismatch between different pieces of code or libraries coming from different projects that are not designed to work together. Interoperability between such libraries are often hindered by abstractions, by design, in the name of hiding implementation details and encapsulation. These problems necessitate forms of abstraction that are easy to manipulate if needed. In this paper, we describe a powerful mechanism to create white-box abstractions, that encourage flatter hierarchies of abstraction and ease of manipulation and customization when necessary: program refinement. In so doing, we rely on the basic principle that writing directly in the host programming language is as least restrictive as one can get in terms of expressiveness, and allow the programmer to reuse and customize existing code snippets to address their specific needs.

Shi, Yang, Wei, Wujing, He, Zongjian, Fan, Hongfei.  2016.  An Ultra-lightweight White-box Encryption Scheme for Securing Resource-constrained IoT Devices. Proceedings of the 32Nd Annual Conference on Computer Security Applications. :16–29.

Embedded devices with constrained computational resources, such as wireless sensor network nodes, electronic tag readers, roadside units in vehicular networks, and smart watches and wristbands, are widely used in the Internet of Things. Many of such devices are deployed in untrustable environments, and others may be easy to lose, leading to possible capture by adversaries. Accordingly, in the context of security research, these devices are running in the white-box attack context, where the adversary may have total visibility of the implementation of the built-in cryptosystem with full control over its execution. It is undoubtedly a significant challenge to deal with attacks from a powerful adversary in white-box attack contexts. Existing encryption algorithms for white-box attack contexts typically require large memory use, varying from one to dozens of megabytes, and thus are not suitable for resource-constrained devices. As a countermeasure in such circumstances, we propose an ultra-lightweight encryption scheme for protecting the confidentiality of data in white-box attack contexts. The encryption is executed with secret components specialized for resource-constrained devices against white-box attacks, and the encryption algorithm requires a relatively small amount of static data, ranging from 48 to 92 KB. The security and efficiency of the proposed scheme have been theoretically analyzed with positive results, and experimental evaluations have indicated that the scheme satisfies the resource constraints in terms of limited memory use and low computational cost.

Lou, Jian, Vorobeychik, Yevgeniy.  2016.  Decentralization and Security in Dynamic Traffic Light Control. Proceedings of the Symposium and Bootcamp on the Science of Security. :90–92.

Complex traffic networks include a number of controlled intersections, and, commonly, multiple districts or municipalities. The result is that the overall traffic control problem is extremely complex computationally. Moreover, given that different municipalities may have distinct, non-aligned, interests, traffic light controller design is inherently decentralized, a consideration that is almost entirely absent from related literature. Both complexity and decentralization have great bearing both on the quality of the traffic network overall, as well as on its security. We consider both of these issues in a dynamic traffic network. First, we propose an effective local search algorithm to efficiently design system-wide control logic for a collection of intersections. Second, we propose a game theoretic (Stackelberg game) model of traffic network security in which an attacker can deploy denial-of-service attacks on sensors, and develop a resilient control algorithm to mitigate such threats. Finally, we propose a game theoretic model of decentralization, and investigate this model both in the context of baseline traffic network design, as well as resilient design accounting for attacks. Our methods are implemented and evaluated using a simple traffic network scenario in SUMO.

Hasegawa, Toru, Tara, Yasutaka, Ryu, Kai, Koizumi, Yuki.  2016.  Emergency Message Delivery Mechanism in NDN Networks. Proceedings of the 3rd ACM Conference on Information-Centric Networking. :199–200.

Emergency message delivery in packet networks is promising in terms of resiliency to failures and service delivery to handicapped persons. In this paper, we propose an NDN(Named Data Networking)-based emergency message delivery mechanism by leveraging multicasting and ABE (Attribute-Based Encryption) functions.

Rajabi, Arezoo, Bobba, Rakesh B..  2016.  A Resilient Algorithm for Power System Mode Estimation Using Synchrophasors. Proceedings of the 2Nd Annual Industrial Control System Security Workshop. :23–29.

Bulk electric systems include hundreds of synchronous generators. Faults in such systems can induce oscillations in the generators which if not detected and controlled can destabilize the system. Mode estimation is a popular method for oscillation detection. In this paper, we propose a resilient algorithm to estimate electro-mechanical oscillation modes in large scale power system in the presence of false data. In particular, we add a fault tolerance mechanism to a variant of alternating direction method of multipliers (ADMM) called S-ADMM. We evaluate our method on an IEEE 68-bus test system under different attack scenarios and show that in all the scenarios our algorithm converges well.

Chlela, Martine, Joos, Geza, Kassouf, Marthe.  2016.  Impact of Cyber-attacks on Islanded Microgrid Operation. Proceedings of the Workshop on Communications, Computation and Control for Resilient Smart Energy Systems. :1:1–1:5.

The prevalent integration of highly intermittent renewable distributed energy resources (DER) into microgrids necessitates the deployment of a microgrid controller. In the absence of the main electric grid setting the network voltage and frequency, the microgrid power and energy management becomes more challenging, accentuating the need for a centralized microgrid controller that, through communication links, ensures smooth operation of the autonomous system. This extensive reliance on information and communication technologies (ICT) creates potential access points and vulnerabilities that may be exploited by cyber-attackers. This paper first presents a typical microgrid configuration operating in islanded mode; the microgrid elements, primary and secondary control functions for power, energy and load management are defined. The information transferred from the central controller to coordinate and dispatch the DERs is provided along with the deployable communication technologies and protocols. The vulnerabilities arising in such microgrids along with the cyber-attacks exploiting them are described. The impact of these attacks on the microgrid controller functions was shown to be dependent on the characteristics, location and target of the cyber-attack, as well as the microgrid configuration and control. A real-time hardware-in-the loop (HIL) testing platform, which emulates a microgrid featuring renewable DERs, an energy storage system (ESS), a diesel generator and controllable loads was used as the case study in order to demonstrate the impact of various cyber-attacks.

Ibrahim, Ahmad, Sadeghi, Ahmad-Reza, Tsudik, Gene, Zeitouni, Shaza.  2016.  DARPA: Device Attestation Resilient to Physical Attacks. Proceedings of the 9th ACM Conference on Security & Privacy in Wireless and Mobile Networks. :171–182.

As embedded devices (under the guise of "smart-whatever") rapidly proliferate into many domains, they become attractive targets for malware. Protecting them from software and physical attacks becomes both important and challenging. Remote attestation is a basic tool for mitigating such attacks. It allows a trusted party (verifier) to remotely assess software integrity of a remote, untrusted, and possibly compromised, embedded device (prover). Prior remote attestation methods focus on software (malware) attacks in a one-verifier/one-prover setting. Physical attacks on provers are generally ruled out as being either unrealistic or impossible to mitigate. In this paper, we argue that physical attacks must be considered, particularly, in the context of many provers, e.g., a network, of devices. As- suming that physical attacks require capture and subsequent temporary disablement of the victim device(s), we propose DARPA, a light-weight protocol that takes advantage of absence detection to identify suspected devices. DARPA is resilient against a very strong adversary and imposes minimal additional hardware requirements. We justify and identify DARPA's design goals and evaluate its security and costs.

Ghosh, Uttam, Dong, Xinshu, Tan, Rui, Kalbarczyk, Zbigniew, Yau, David K.Y., Iyer, Ravishankar K..  2016.  A Simulation Study on Smart Grid Resilience Under Software-Defined Networking Controller Failures. Proceedings of the 2Nd ACM International Workshop on Cyber-Physical System Security. :52–58.

Riding on the success of SDN for enterprise and data center networks, recently researchers have shown much interest in applying SDN for critical infrastructures. A key concern, however, is the vulnerability of the SDN controller as a single point of failure. In this paper, we develop a cyber-physical simulation platform that interconnects Mininet (an SDN emulator), hardware SDN switches, and PowerWorld (a high-fidelity, industry-strength power grid simulator). We report initial experiments on how a number of representative controller faults may impact the delay of smart grid communications. We further evaluate how this delay may affect the performance of the underlying physical system, namely automatic gain control (AGC) as a fundamental closed-loop control that regulates the grid frequency to a critical nominal value. Our results show that when the fault-induced delay reaches seconds (e.g., more than four seconds in some of our experiments), degradation of the AGC becomes evident. Particularly, the AGC is most vulnerable when it is in a transient following say step changes in loading, because the significant state fluctuations will exacerbate the effects of using a stale system state in the control.

Nicol, David M., Kumar, Rakesh.  2016.  Efficient Monte Carlo Evaluation of SDN Resiliency. Proceedings of the 2016 Annual ACM Conference on SIGSIM Principles of Advanced Discrete Simulation. :143–152.

Software defined networking (SDN) is an emerging technology for controlling flows through networks. Used in the context of industrial control systems, an objective is to design configurations that have built-in protection for hardware failures in the sense that the configuration has "baked-in" back-up routes. The objective is to leave the configuration static as long as possible, minimizing the need to have the controller push in new routing and filtering rules We have designed and implemented a tool that enables us to determine the complete connectivity map from an analysis of all switch configurations in the network. We can use this tool to explore the impact of a link failure, in particular to determine whether the failure induces loss of the ability to deliver a flow even after the built-in back-up routes are used. A measure of the original configuration's resilience to link failure is the mean number of link failures required to induce the first such loss of service. The computational cost of each link failure and subsequent analysis is large, so there is much to be gained by reducing the overall cost of obtaining a statistically valid estimate of resiliency. This paper shows that when analysis of a network state can identify all as-yet-unfailed links any one of whose failure would induce loss of a flow, then we can use the technique of importance sampling to estimate the mean number of links required to fail before some flow is lost, and analyze the potential for reducing the variance of the sample statistic. We provide both theoretical and empirical evidence for significant variance reduction.

Nisha, Dave, M..  2016.  Storage as a parameter for classifying dynamic key management schemes proposed for WSNs. 2016 International Conference on Computational Techniques in Information and Communication Technologies (ICCTICT). :51–56.

Real world applications of Wireless Sensor Networks such as border control, healthcare monitoring and target tracking require secure communications. Thus, during WSN setup, one of the first requirements is to distribute the keys to the sensor nodes which can be later used for securing the messages exchanged between sensors. The key management schemes in WSN secure the communication between a pair or a group of nodes. However, the storage capacity of the sensor nodes is limited which makes storage requirement as an important parameter for the evaluation of key management schemes. This paper classifies the existing key management schemes proposed for WSNs into three categories: storage inefficient, storage efficient and highly storage efficient key management schemes.

Kosek, A. M..  2016.  Contextual anomaly detection for cyber-physical security in Smart Grids based on an artificial neural network model. 2016 Joint Workshop on Cyber- Physical Security and Resilience in Smart Grids (CPSR-SG). :1–6.

This paper presents a contextual anomaly detection method and its use in the discovery of malicious voltage control actions in the low voltage distribution grid. The model-based anomaly detection uses an artificial neural network model to identify a distributed energy resource's behaviour under control. An intrusion detection system observes distributed energy resource's behaviour, control actions and the power system impact, and is tested together with an ongoing voltage control attack in a co-simulation set-up. The simulation results obtained with a real photovoltaic rooftop power plant data show that the contextual anomaly detection performs on average 55% better in the control detection and over 56% better in the malicious control detection over the point anomaly detection.

Willers, Oliver, Huth, Christopher, Guajardo, Jorge, Seidel, Helmut.  2016.  MEMS Gyroscopes As Physical Unclonable Functions. Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security. :591–602.

A key requirement for most security solutions is to provide secure cryptographic key storage in a way that will easily scale in the age of the Internet of Things. In this paper, we focus on providing such a solution based on Physical Unclonable Functions (PUFs). To this end, we focus on microelectromechanical systems (MEMS)-based gyroscopes and show via wafer-level measurements and simulations, that it is feasible to use the physical and electrical properties of these sensors for cryptographic key generation. After identifying the most promising features, we propose a novel quantization scheme to extract bit strings from the MEMS analog measurements. We provide upper and lower bounds for the minimum entropy of the derived bit strings and fully analyze the intra- and inter-class distributions across the operation range of the MEMS device. We complement these measurements via Monte-Carlo simulations based on the distributions of the parameters measured on actual devices. We also propose and evaluate a complete cryptographic key generation chain based on fuzzy extractors. We derive a full entropy 128-bit key using the obtained min-entropy estimates, requiring 1219 bits of helper data with an (authentication) failure probability of 4 . 10-7. In addition, we propose a dedicated MEMS-PUF design, which is superior to our measured sensor, in terms of chip area, quality and quantity of key seed features.

Grubbs, Paul, McPherson, Richard, Naveed, Muhammad, Ristenpart, Thomas, Shmatikov, Vitaly.  2016.  Breaking Web Applications Built On Top of Encrypted Data. Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security. :1353–1364.

We develop a systematic approach for analyzing client-server applications that aim to hide sensitive user data from untrusted servers. We then apply it to Mylar, a framework that uses multi-key searchable encryption (MKSE) to build Web applications on top of encrypted data. We demonstrate that (1) the Popa-Zeldovich model for MKSE does not imply security against either passive or active attacks; (2) Mylar-based Web applications reveal users' data and queries to passive and active adversarial servers; and (3) Mylar is generically insecure against active attacks due to system design flaws. Our results show that the problem of securing client-server applications against actively malicious servers is challenging and still unsolved. We conclude with general lessons for the designers of systems that rely on property-preserving or searchable encryption to protect data from untrusted servers.

Mavromoustakos, Stephanos, Patel, Aakash, Chaudhary, Kinjal, Chokshi, Parth, Patel, Shaili.  2016.  Causes and Prevention of SQL Injection Attacks in Web Applications. Proceedings of the 4th International Conference on Information and Network Security. :55–59.

SQL injection is one of the major threats to the security of the web applications. Attackers try to gain unauthorized access to the database, which has vital and private information of the users. Many researchers have provided various techniques and practices to protect the web applications from attackers. There is a plethora of techniques available to perform SQL injection and usually not everyone is familiar with every attack. Hence, this kind of attack is still the most prevalent. In this paper, we have presented the types of SQL injections attacks and most dominant ways to prevent them.

Harshaw, Christopher R., Bridges, Robert A., Iannacone, Michael D., Reed, Joel W., Goodall, John R..  2016.  GraphPrints: Towards a Graph Analytic Method for Network Anomaly Detection. Proceedings of the 11th Annual Cyber and Information Security Research Conference. :15:1–15:4.

This paper introduces a novel graph-analytic approach for detecting anomalies in network flow data called GraphPrints. Building on foundational network-mining techniques, our method represents time slices of traffic as a graph, then counts graphlets–-small induced subgraphs that describe local topology. By performing outlier detection on the sequence of graphlet counts, anomalous intervals of traffic are identified, and furthermore, individual IPs experiencing abnormal behavior are singled-out. Initial testing of GraphPrints is performed on real network data with an implanted anomaly. Evaluation shows false positive rates bounded by 2.84% at the time-interval level, and 0.05% at the IP-level with 100% true positive rates at both.

Sze, Wai Kit, Srivastava, Abhinav, Sekar, R..  2016.  Hardening OpenStack Cloud Platforms Against Compute Node Compromises. Proceedings of the 11th ACM on Asia Conference on Computer and Communications Security. :341–352.

Infrastructure-as-a-Service (IaaS) clouds such as OpenStack consist of two kinds of nodes in their infrastructure: control nodes and compute nodes. While control nodes run all critical services, compute nodes host virtual machines of customers. Given the large number of compute nodes, and the fact that they are hosting VMs of (possibly malicious) customers, it is possible that some of the compute nodes may be compromised. This paper examines the impact of such a compromise. We focus on OpenStack, a popular open-source cloud plat- form that is widely adopted. We show that attackers com- promising a single compute node can extend their controls over the entire cloud infrastructure. They can then gain free access to resources that they have not paid for, or even bring down the whole cloud to affect all customers. This startling result stems from the cloud platform's misplaced trust, which does not match today's threats. To overcome the weakness, we propose a new system, called SOS , for hardening OpenStack. SOS limits trust on compute nodes. SOS consists of a framework that can enforce a wide range of security policies. Specifically, we applied mandatory access control and capabilities to con- fine interactions among different components. Effective confinement policies are generated automatically. Furthermore, SOS requires no modifications to the OpenStack. This has allowed us to deploy SOS on multiple versions of OpenStack. Our experimental results demonstrate that SOS is scalable, incurs negligible overheads and offers strong protection.

Martin, Jeremy, Rye, Erik, Beverly, Robert.  2016.  Decomposition of MAC Address Structure for Granular Device Inference. Proceedings of the 32Nd Annual Conference on Computer Security Applications. :78–88.

Common among the wide variety of ubiquitous networked devices in modern use is wireless 802.11 connectivity. The MAC addresses of these devices are visible to a passive adversary, thereby presenting security and privacy threats - even when link or application-layer encryption is employed. While it is well-known that the most significant three bytes of a MAC address, the OUI, coarsely identify a device's manufacturer, we seek to better understand the ways in which the remaining low-order bytes are allocated in practice. From a collection of more than two billion 802.11 frames observed in the wild, we extract device and model information details for over 285K devices, as leaked by various management frames and discovery protocols. From this rich dataset, we characterize overall device populations and densities, vendor address allocation policies and utilization, OUI sharing among manufacturers, discover unique models occurring in multiple OUIs, and map contiguous address blocks to specific devices. Our mapping thus permits fine-grained device type and model predictions for unknown devices solely on the basis of their MAC address. We validate our inferences on both ground-truth data and a third-party dataset, where we obtain high accuracy. Our results empirically demonstrate the extant structure of the low-order MAC bytes due to manufacturer's sequential allocation policies, and the security and privacy concerns therein.

Zhang, Jun, Xiao, Xiaokui, Xie, Xing.  2016.  PrivTree: A Differentially Private Algorithm for Hierarchical Decompositions. Proceedings of the 2016 International Conference on Management of Data. :155–170.

Given a set D of tuples defined on a domain Omega, we study differentially private algorithms for constructing a histogram over Omega to approximate the tuple distribution in D. Existing solutions for the problem mostly adopt a hierarchical decomposition approach, which recursively splits Omega into sub-domains and computes a noisy tuple count for each sub-domain, until all noisy counts are below a certain threshold. This approach, however, requires that we (i) impose a limit h on the recursion depth in the splitting of Omega and (ii) set the noise in each count to be proportional to h. The choice of h is a serious dilemma: a small h makes the resulting histogram too coarse-grained, while a large h leads to excessive noise in the tuple counts used in deciding whether sub-domains should be split. Furthermore, h cannot be directly tuned based on D; otherwise, the choice of h itself reveals private information and violates differential privacy. To remedy the deficiency of existing solutions, we present PrivTree, a histogram construction algorithm that adopts hierarchical decomposition but completely eliminates the dependency on a pre-defined h. The core of PrivTree is a novel mechanism that (i) exploits a new analysis on the Laplace distribution and (ii) enables us to use only a constant amount of noise in deciding whether a sub-domain should be split, without worrying about the recursion depth of splitting. We demonstrate the application of PrivTree in modelling spatial data, and show that it can be extended to handle sequence data (where the decision in sub-domain splitting is not based on tuple counts but a more sophisticated measure). Our experiments on a variety of real datasets show that PrivTree considerably outperforms the states of the art in terms of data utility.

2017-03-27
Phull, Sona, Som, Subhranil.  2016.  Symmetric Cryptography Using Multiple Access Circular Queues (MACQ). Proceedings of the Second International Conference on Information and Communication Technology for Competitive Strategies. :107:1–107:6.

In order to provide secure data communication in present cyber space world, a stronger encryption technique becomes a necessity that can help people to protect their sensitive information from cryptanalyst. This paper proposes a novel symmetric block cipher algorithm that uses multiple access circular queues (MACQs) of variable lengths for diffusion of information to a greater extent. The keys are randomly generated and will be of variable lengths depending upon the size of each MACQ.A number of iterations of circular rotations, swapping of elements and XORing the key with queue elements are performed on each MACQ. S-box is used so that the relationship between the key and the cipher text remains indeterminate or obscure. These operations together will help in transforming the cipher into a much more complex and secure block cipher. This paper attempt to propose an encryption algorithm that is secure and fast.

Schwichtenberg, Simon, Engels, Gregor.  2016.  Automatized Derivation of Comprehensive Specifications for Black-box Services. Proceedings of the 38th International Conference on Software Engineering Companion. :815–818.

Today, cloud vendors host third party black-box services, whose developers usually provide only textual descriptions or purely syntactical interface specifications. Cloud vendors that give substantial support to other third party developers to integrate hosted services into new software solutions would have a unique selling feature over their competitors. However, to reliably determine if a service is reusable, comprehensive service specifications are needed. Characteristic for comprehensive in contrast to syntactical specifications are the formalization of ontological and behavioral semantics, homogeneity according to a global ontology, and a service grounding that links the abstract service description and its technical realization. Homogeneous, semantical specifications enable to reliably identify reusable services, whereas the service grounding is needed for the technical service integration. In general, comprehensive specifications are not available and have to be derived. Existing automatized approaches are restricted to certain characteristics of comprehensiveness. In my PhD, I consider an automatized approach to derive fully-fledged comprehensive specifications for black-box services. Ontological semantics are derived from syntactical interface specifications. Behavioral semantics are mined from call logs that cloud vendors create to monitor the hosted services. The specifications are harmonized over a global ontology. The service grounding is established using traceability information. The approach enables third party developers to compose services into complex systems and creates new sales channels for cloud and service providers.

Argyros, George, Stais, Ioannis, Jana, Suman, Keromytis, Angelos D., Kiayias, Aggelos.  2016.  SFADiff: Automated Evasion Attacks and Fingerprinting Using Black-box Differential Automata Learning. Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security. :1690–1701.

Finding differences between programs with similar functionality is an important security problem as such differences can be used for fingerprinting or creating evasion attacks against security software like Web Application Firewalls (WAFs) which are designed to detect malicious inputs to web applications. In this paper, we present SFADIFF, a black-box differential testing framework based on Symbolic Finite Automata (SFA) learning. SFADIFF can automatically find differences between a set of programs with comparable functionality. Unlike existing differential testing techniques, instead of searching for each difference individually, SFADIFF infers SFA models of the target programs using black-box queries and systematically enumerates the differences between the inferred SFA models. All differences between the inferred models are checked against the corresponding programs. Any difference between the models, that does not result in a difference between the corresponding programs, is used as a counterexample for further refinement of the inferred models. SFADIFF's model-based approach, unlike existing differential testing tools, also support fully automated root cause analysis in a domain-independent manner. We evaluate SFADIFF in three different settings for finding discrepancies between: (i) three TCP implementations, (ii) four WAFs, and (iii) HTML/JavaScript parsing implementations in WAFs and web browsers. Our results demonstrate that SFADIFF is able to identify and enumerate the differences systematically and efficiently in all these settings. We show that SFADIFF is able to find differences not only between different WAFs but also between different versions of the same WAF. SFADIFF is also able to discover three previously-unknown differences between the HTML/JavaScript parsers of two popular WAFs (PHPIDS 0.7 and Expose 2.4.0) and the corresponding parsers of Google Chrome, Firefox, Safari, and Internet Explorer. We confirm that all these differences can be used to evade the WAFs and launch successful cross-site scripting attacks.

Eberly, Wayne.  2016.  Selecting Algorithms for Black Box Matrices: Checking For Matrix Properties That Can Simplify Computations. Proceedings of the ACM on International Symposium on Symbolic and Algebraic Computation. :207–214.

Processes to automate the selection of appropriate algorithms for various matrix computations are described. In particular, processes to check for, and certify, various matrix properties of black-box matrices are presented. These include sparsity patterns and structural properties that allow "superfast" algorithms to be used in place of black-box algorithms. Matrix properties that hold generically, and allow the use of matrix preconditioning to be reduced or eliminated, can also be checked for and certified –- notably including in the small-field case, where this presently has the greatest impact on the efficiency of the computation.