Biblio
In this paper, we introduce the use of adaptive controllers into software-defined networking (SDN) and propose the use of adaptive consistency models in the context of distributed SDN controllers. These adaptive controllers can tune their own configurations in real-time in order to enhance the performance of the applications running on top of them. We expect that the use of such controllers could alleviate some of the emerging challenges in SDN that could have an impact on the performance, security, or scalability of the network. Further, we propose extending the SDN controller architecture to support adaptive consistency based on tunable consistency models. Finally, we compare the performance of a proof-of-concept distributed load-balancing application when it runs on-top of: (1) an adaptive and (2) a non-adaptive controller. Our results indicate that adaptive controllers were more resilient to sudden changes in the network conditions than the non-adaptive ones.
Today, outsourcing query processing tasks to remote cloud servers becomes a viable option; such outsourcing calls for encrypting data stored at the server so as to render it secure against eavesdropping adversaries and/or an honest-but-curious server itself. At the same time, to be efficiently managed, outsourced data should be indexed, and even adaptively so, as a side-effect of query processing. Computationally heavy encryption schemes render such outsourcing unattractive; an alternative, Order-Preserving Encryption Scheme (OPES), intentionally preserves and reveals the order in the data, hence is unattractive from the security viewpoint. In this paper, we propose and analyze a scheme for lightweight and indexable encryption, based on linear-algebra operations. Our scheme provides higher security than OPES and allows for range and point queries to be efficiently evaluated over encrypted numeric data, with decryption performed at the client side. We implement a prototype that performs incremental, query-triggered adaptive indexing over encrypted numeric data based on this scheme, without leaking order information in advance, and without prohibitive overhead, as our extensive experimental study demonstrates.
By maintaining the data in main memory, in-memory databases dramatically reduce the I/O cost of transaction processing. However, for recovery purposes, in-memory systems still need to flush the log to disk, which incurs a substantial number of I/Os. Recently, command logging has been proposed to replace the traditional data log (e.g., ARIES logging) in in-memory databases. Instead of recording how the tuples are updated, command logging only tracks the transactions that are being executed, thereby effectively reducing the size of the log and improving the performance. However, when a failure occurs, all the transactions in the log after the last checkpoint must be redone sequentially and this significantly increases the cost of recovery. In this paper, we first extend the command logging technique to a distributed system, where all the nodes can perform their recovery in parallel. We show that in a distributed system, the only bottleneck of recovery caused by command logging is the synchronization process that attempts to resolve the data dependency among the transactions. We then propose an adaptive logging approach by combining data logging and command logging. The percentage of data logging versus command logging becomes a tuning knob between the performance of transaction processing and recovery to meet different OLTP requirements, and a model is proposed to guide such tuning. Our experimental study compares the performance of our proposed adaptive logging, ARIES-style data logging and command logging on top of H-Store. The results show that adaptive logging can achieve a 10x boost for recovery and a transaction throughput that is comparable to that of command logging.
Security is a well-known critical issue and exploitation of vulnerabilities is increasing in number, sophistication and damage. Furthermore, legacy systems tend to offer difficulty when upgrades are needed, specially when security recommendations are proposed. This paper presents a strategy for legacy systems based on three disciplines which guide the adoption of secure routines while avoid production drop. We present a prototype framework and discuss its success in providing security to the network of a power plant.
The reconfiguration of FPGAs includes downloading the bit-stream file which contains the new design on the FPGA. The option to reconfigure FPGAs dynamically opens up the threat of stealing the Intellectual Property (IP) of the design. Since the configuration is usually stored in external memory, this can be easily tapped and read out by an eaves-dropper. This work presents a low cost solution in order to secure the reconfiguration of FPGAs. The proposed solution is based on an efficient-compact hardware implementation for AEGIS which is considered one of the candidates to the competition of CAESAR. The proposed architecture depends on using 1/4 AES-round for reducing the consumed area. We evaluated the presented design using 90 and 65 nm technologies. Our comparison to existing AES-based schemes reveals that the proposed design is better in terms of the hardware performance (Thr./mm2).
Suppose that you have n truly random bits x=(x1,…,xn) and you wish to use them to generate m≫ n pseudorandom bits y=(y1,…, ym) using a local mapping, i.e., each yi should depend on at most d=O(1) bits of x. In the polynomial regime of m=ns, stextgreater1, the only known solution, originates from (Goldreich, ECCC 2000), is based on Random Local Functions: Compute yi by applying some fixed (public) d-ary predicate P to a random (public) tuple of distinct inputs (xi1,…,xid). Our goal in this paper is to understand, for any value of s, how the pseudorandomness of the resulting sequence depends on the choice of the underlying predicate. We derive the following results: (1) We show that pseudorandomness against F2-linear adversaries (i.e., the distribution y has low-bias) is achieved if the predicate is (a) k=Ω(s)-resilience, i.e., uncorrelated with any k-subset of its inputs, and (b) has algebraic degree of Ω(s) even after fixing Ω(s) of its inputs. We also show that these requirements are necessary, and so they form a tight characterization (up to constants) of security against linear attacks. Our positive result shows that a d-local low-bias generator can have output length of nΩ(d), answering an open question of Mossel, Shpilka and Trevisan (FOCS, 2003). Our negative result shows that a candidate for pseudorandom generator proposed by the first author (computational complexity, 2015) and by O’Donnell and Witmer (CCC 2014) is insecure. We use similar techniques to refute a conjecture of Feldman, Perkins and Vempala (STOC 2015) regarding the hardness of planted constraint satisfaction problems. (2) Motivated by the cryptanalysis literature, we consider security against algebraic attacks. We provide the first theoretical treatment of such attacks by formalizing a general notion of algebraic inversion and distinguishing attacks based on the Polynomial Calculus proof system. We show that algebraic attacks succeed if and only if there exist a degree e=O(s) non-zero polynomial Q whose roots cover the roots of P or cover the roots of P’s complement. As a corollary, we obtain the first example of a predicate P for which the generated sequence y passes all linear tests but fails to pass some polynomial-time computable test, answering an open question posed by the first author (Question 4.9, computational complexity 2015).
Imposters gain unauthorized access to biometric recognition systems using fake biometric data of the legitimate user termed as spoofing. Spoofing of face recognition systems is done by photographs, 3D models and videos of the user. Attack video contains noise from the acquisition process. In this work, we use noise residual content of the video in order to detect spoofed videos. We take advantage of wavelet transform for representing the noise video. Samples of the noise video, termed as visual rhythm image is created for each video. Local Binary Pattern (LBP) and uniform Local Binary Pattern (LBPu2) are extracted from the visual rhythm image followed by classification using Support Vector Machine (SVM). Large size of video from which a number of frames are used for analysis results in huge execution timing. In this work the spoof detection algorithm is applied on various levels of subsections of the video frames resulting in reduced execution timing with reasonable detection accuracies.
Software development and web applications have become fundamental in our lives. Millions of users access these applications to communicate, obtain information and perform transactions. However, these users are exposed to many risks; commonly due to the developer's lack of experience in security protocols. Although there are many researches about web security and hacking protection, there are plenty of vulnerable websites. This article focuses in analyzing 3 main hacking techniques: XSS, CSRF, and SQL Injection over a representative group of Colombian websites. Our goal is to obtain information about how Colombian companies and organizations give (or not) relevance to security; and how the final user could be affected.
Android malware growth has been increasing dramatically as well as the diversity and complicity of their developing techniques. Machine learning techniques have been applied to detect malware by modeling patterns of static features and dynamic behaviors of malware. The accuracy rates of the machine learning classifiers differ depending on the quality of the features. We increase the quality of the features by relating between the apps' features and the features that are required to deliver its category's functionality. To measure the benign app references, the features of the top rated apps in a specific category are utilized to train a malware detection classifier for that given category. Android apps stores such as Google Play organize apps into different categories. Each category has its distinct functionalities which means the apps under a specific category are similar in their static and dynamic features. In other words, benign apps under a certain category tend to share a common set of features. On the contrary, malicious apps tend to have abnormal features, which are uncommon for the category that they belong to. This paper proposes category-based machine learning classifiers to enhance the performance of classification models at detecting malicious apps under a certain category. The intensive machine learning experiments proved that category-based classifiers report a remarkable higher average performance compared to non-category based.
Privacy enhancing technologies (PETs) are ubiquitous nowadays. They are beneficial for a wide range of users. However, PETs are not always used for legal activity. The present paper is focused on Tor users deanonimization1 using out-of-the box technologies and a basic machine learning algorithm. The aim of the work is to show that it is possible to deanonimize a small fraction of users without having a lot of resources and state-of-the-art machine learning techniques. The deanonimization is a very important task from the point of view of national security. To address this issue, we are using a website fingerprinting attack.
We present a demonstration of the ARIA framework, a modular approach for rapid development of virtual humans for information retrieval that have linguistic, emotional, and social skills and a strong personality. We demonstrate the framework's capabilities in a scenario where `Alice in Wonderland', a popular English literature book, is embodied by a virtual human representing Alice. The user can engage in an information exchange dialogue, where Alice acts as the expert on the book, and the user as an interested novice. Besides speech recognition, sophisticated audio-visual behaviour analysis is used to inform the core agent dialogue module about the user's state and intentions, so that it can go beyond simple chat-bot dialogue. The behaviour generation module features a unique new capability of being able to deal gracefully with interruptions of the agent.
Given the proliferation of digital assistants in everyday mobile technology, it appears inevitable that next generation vehicles will be embodied by similar agents, offering engaging, natural language interactions. However, speech can be cognitively captivating. It is therefore important to understand the demand that such interfaces may place on drivers. Twenty-five participants undertook four drives (counterbalanced), in a medium-fidelity driving simulator: 1. Interacting with a state-of-the-art digital driving assistant ('DDA') (presented using Wizard-of-Oz); 2. Engaged in a hands-free mobile phone conversation; 3. Undertaking the delayed-digit recall ('2-back') task and 4. With no secondary task (baseline). Physiological arousal, subjective workload assessment, tactile detection task (TDT) and driving performance measures consistently revealed the '2-back' drive as the most cognitively demanding (highest workload, poorest TDT performance). Mobile phone and DDA conditions were largely equivalent, attracting low/medium cognitive workload. Findings are discussed in the context of designing in-vehicle natural language interfaces to mitigate cognitive demand.
Traditional authentication techniques such as static passwords are vulnerable to replay and guessing attacks. Recently, many studies have been conducted on keystroke dynamics as a promising behavioral biometrics for strengthening user authentication, however, current keystroke based solutions suffer from a numerous number of features with an insufficient number of samples which lead to a high verification error rate and high verification time. In this paper, a keystroke dynamics based authentication system is proposed for cloud environments that supports fixed and free text samples. The proposed system utilizes the ReliefF dimensionality reduction method, as a preprocessing step, to minimize the feature space dimensionality. The proposed system applies clustering to users' profile templates to reduce the verification time. The proposed system is applied to two different benchmark datasets. Experimental results prove the effectiveness and efficiency of the proposed system.
Side Channel Attacks (SCA) have proven to be a practical threat to the security of embedded systems, exploiting the information leakage coming from unintended channels concerning an implementation of a cryptographic primitive. Given the large variety of embedded platforms, and the ubiquity of the need for secure cryptographic implementations, a systematic and automated approach to deploy SCA countermeasures at design time is strongly needed. In this paper, we provide an overview of recent compiler-based techniques to protect software implementations against SCA, making them amenable to automated application in the development of secure-by-design systems.
When running large human computation tasks in the real-world, honeypots play an important role for assessing the overall quality of the work produced. The generation of such honeypots can be a significant burden on the task owner as they require specific characteristics in their design and implementation and continuous maintenance when operating data pipelines that include a human computation component. In this extended abstract we outline a novel approach for creating honeypots using automatically generated questions from a reference knowledge base with the ability to control such parameters as topic and difficulty.
Web application security has become crucially vital these days. Earlier "default allow" model was used to secure web applications but it was unable to secure web applications against plethora of attacks [1]. In contrast, more restricted security to the web applications is provided by default deny model which at first, builds a model for the particular application and then permits merely those requests that conform to that model while ignoring everything else. Besides this, a novel and effective methodology is followed that allows to analyze the validity of application requests and further results in the generation of semi structured XML cases for the web applications. Furthermore, mature and resilient XML cases are generated by employing learning techniques. This system will further be gauged by examining that XML file containing cases are in correct accordance with the XML format or not. Moreover, the distinction between malicious and non-malicious traffic is carried out carefully. Results have proved its efficacy of rule generation employing access traffic log of cross site scripting (XSS), SQL injection, HTTP Request Splitting, HTTP response splitting and Buffer overflow attacks.
Information Fusion (IF) systems have long exploited data provided by hard (physics-based) sensors with the aspiration of making sense of the environment they are monitoring. In recent times, the IF community has recognized the potential of utilizing data generated by people, also known as soft data. In this study, we demonstrate how course of action (CoA) generation, one of the key elements of Level 3 High-Level Information Fusion and a vital component for security and defense decision support systems, can be augmented using soft (human-derived) data for improved mission effectiveness. This conceptualization is validated through an elaborate experiment situated in the maritime world. To the best of the authors' knowledge, this is the first study to apply soft data to automatic CoA generation in the maritime domain.
Self-describing key-value data formats such as JSON are becoming increasingly popular as application developers choose to avoid the rigidity imposed by the relational model. Database systems designed for these self-describing formats, such as MongoDB, encourage users to use denormalized, heavily nested data models so that relationships across records and other schema information need not be predefined or standardized. Such data models contribute to long-term development complexity, as their lack of explicit entity and relationship tracking burdens new developers unfamiliar with the dataset. Furthermore, the large amount of data repetition present in such data layouts can introduce update anomalies and poor scan performance, which reduce both the quality and performance of analytics over the data. In this paper we present an algorithm that automatically transforms the denormalized, nested data commonly found in NoSQL systems into traditional relational data that can be stored in a standard RDBMS. This process includes a schema generation algorithm that discovers relationships across the attributes of the denormalized datasets in order to organize those attributes into relational tables. It further includes a matching algorithm that discovers sets of attributes that represent overlapping entities and merges those sets together. These algorithms reduce data repetition, allow the use of data analysis tools targeted at relational data, accelerate scan-intensive algorithms over the data, and help users gain a semantic understanding of complex, nested datasets.
Presented at the NSA Science of Security Quarterly Meeting, July 2016.
General sparse matrix-matrix multiplication (SpGEMM) is a core component of many algorithms. A number of recent works have used high throughput graphics processing units (GPUs) to accelerate SpGEMM. However, exploiting the power of GPUs for SpGEMM requires addressing a number of challenges, including highly imbalanced workloads and large numbers of inefficient random global memory accesses. This paper presents a SpGEMM algorithm which uses several novel techniques to overcome these problems. We first propose two low cost methods to achieve perfect load balancing during the most expensive step in SpGEMM. Next, we show how to eliminate nearly all random global memory accesses using shared memory based hash tables. To optimize the performance of the hash tables, we propose a lightweight method to estimate the number of nonzeros in the output matrix. We compared our algorithm to the CUSP, CUSPARSE and the state-of-the-art BHSPARSE GPU SpGEMM algorithms, and show that it performs 5.6x, 2.4x and 1.5x better on average, and up to 11.8x, 9.5x and 2.5x better in the best case, respectively. Furthermore, we show that our algorithm performs especially well on highly imbalanced and unstructured matrices.
A major challenge of the existing attack detection approaches is the identification of relevant information to a particular situation, and the use of such information to perform multi-evidence intrusion detection. Addressing such a limitation requires integrating several aspects of context to better predict, avoid and respond to impending attacks. The quality and adequacy of contextual information is important to decrease uncertainty and correctly identify potential cyber-attacks. In this paper, a systematic methodology has been used to identify contextual dimensions that improve the effectiveness of detecting cyber-attacks. This methodology combines graph, probability, and information theories to create several context-based attack prediction models that analyze data at a high- and low-level. An extensive validation of our approach has been performed using a prototype system and several benchmark intrusion detection datasets yielding very promising results.
The ability to identify mobile apps in network traffic has significant implications in many domains, including traffic management, malware detection, and maintaining user privacy. App identification methods in the literature typically use deep packet inspection (DPI) and analyze HTTP headers to extract app fingerprints. However, these methods cannot be used if HTTP traffic is encrypted. We investigate whether Android apps can be identified from their launch-time network traffic using only TCP/IP headers. We first capture network traffic of 86,109 app launches by repeatedly running 1,595 apps on 4 distinct Android devices. We then use supervised learning methods used previously in the web page identification literature, to identify the apps that generated the traffic. We find that: (i) popular Android apps can be identified with 88% accuracy, by using the packet sizes of the first 64 packets they generate, when the learning methods are trained and tested on the data collected from same device; (ii) when the data from an unseen device (but similar operating system/vendor) is used for testing, the apps can be identified with 67% accuracy; (iii) the app identification accuracy does not drop significantly even if the training data are stale by several days, and (iv) the accuracy does drop quite significantly if the operating system/vendor is very different. We discuss the implications of our findings as well as open issues.
Remote attestation is a crucial security service particularly relevant to increasingly popular IoT (and other embedded) devices. It allows a trusted party (verifier) to learn the state of a remote, and potentially malware-infected, device (prover). Most existing approaches are static in nature and only check whether benign software is initially loaded on the prover. However, they are vulnerable to runtime attacks that hijack the application's control or data flow, e.g., via return-oriented programming or data-oriented exploits. As a concrete step towards more comprehensive runtime remote attestation, we present the design and implementation of Control-FLow ATtestation (C-FLAT) that enables remote attestation of an application's control-flow path, without requiring the source code. We describe a full prototype implementation of C-FLAT on Raspberry Pi using its ARM TrustZone hardware security extensions. We evaluate C-FLAT's performance using a real-world embedded (cyber-physical) application, and demonstrate its efficacy against control-flow hijacking attacks.



