Biblio
Big data is the next frontier for innovation, competition, and productivity. It underpins major trends such as social networks, mobile devices, healthcare, and stock markets. Big data is efficiently stored in the cloud because of its high-volume, high-velocity, and high-variety data resources. Unauthorized access is the gravest threat to big data in the cloud environment because files are stored remotely. Attribute-Based Encryption (ABE) is an efficient access control technique that guarantees end-to-end security for big data in the cloud. Most existing ABE schemes are based on bilinear pairings. In this paper, we construct a novel ABE scheme for big data in the cloud. Our proposed scheme is based on quadratic residues and attribute union, which rests on the fundamental theorem of arithmetic.
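As an illustrative aside (not the paper's construction), the quadratic-residue primitive that such schemes build on can be sketched as follows: deciding quadratic residuosity modulo a prime is easy via Euler's criterion, while doing so modulo a composite without its factorization is believed hard, which is what gives the primitive its asymmetry. The function names below are hypothetical.

```python
# Minimal sketch of the quadratic-residue building block (illustrative only;
# not the ABE scheme described in the abstract).

def is_qr_mod_prime(a: int, p: int) -> bool:
    """Euler's criterion: a is a quadratic residue mod an odd prime p
    iff a^((p-1)/2) == 1 (mod p)."""
    return pow(a % p, (p - 1) // 2, p) == 1

def is_qr_mod_composite(a: int, p: int, q: int) -> bool:
    """With the factorization N = p*q known, a is a QR mod N
    iff it is a QR mod both prime factors."""
    return is_qr_mod_prime(a, p) and is_qr_mod_prime(a, q)

# Example: 2 is a QR mod 7 (3*3 = 9 = 2 mod 7) but not mod 11.
assert is_qr_mod_prime(2, 7) and not is_qr_mod_prime(2, 11)
```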
We present the Chained Attacks approach, an automated model-based approach to test the security of web applications that does not require a background in formal methods. Starting from a set of HTTP conversations and a configuration file providing the testing surface and purpose, a model of the System Under Test (SUT) is generated and input, along with the web attacker model we defined, to a model checker acting as a test oracle. The HTTP conversations, payload libraries, and a mapping created while generating the model aid the concretization of the test cases, allowing for their execution on the SUT's implementation. We applied our approach to a real-life case study and were able to find a combination of different attacks representing the concrete chained attack performed by a bug bounty hunter.
The relevance of identity data leaks on the Internet is more present than ever. Almost every month we read about leakage of databases with more than a million users in the news. Smaller but not less dangerous leaks happen even multiple times a day. The public availability of such leaked data is a major threat to the victims, but it also creates the opportunity to learn not only about the security of service providers but also about the behavior of users when choosing passwords. Our goal is to analyze this data and generate knowledge that can be used to increase security awareness and security, respectively. This paper presents a novel approach to the automatic analysis of the vast majority of bigger and smaller leaks. Our contribution is the concept and a prototype implementation of a parser, composed of a syntactic and a semantic module, and a data analyzer for identity leaks. In this context, we deal with the two major challenges: the huge variety of formats and the recognition of unknown data types in leaks. Based on the data collected, this paper reveals how easy it is for criminals to collect large numbers of passwords that are stored in plain text or only weakly hashed.
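As a purely illustrative sketch (not the authors' parser), the semantic step of recognizing unknown data types in a leak record can be approximated with simple heuristics; the patterns, separator, and field labels below are assumptions.

```python
import re

# Illustrative heuristics for classifying fields in a leaked record line
# (not the paper's parser; patterns are simplifying assumptions).
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")
MD5_RE = re.compile(r"^[0-9a-fA-F]{32}$")
SHA1_RE = re.compile(r"^[0-9a-fA-F]{40}$")

def classify_field(value: str) -> str:
    if EMAIL_RE.match(value):
        return "email"
    if MD5_RE.match(value):
        return "md5_hash"
    if SHA1_RE.match(value):
        return "sha1_hash"
    return "plaintext_or_unknown"

def parse_line(line: str, sep: str = ":"):
    # Syntactic step: split on a guessed separator; semantic step: label fields.
    return [(field, classify_field(field)) for field in line.strip().split(sep)]

print(parse_line("alice@example.com:5f4dcc3b5aa765d61d8327deb882cf99"))
# -> [('alice@example.com', 'email'), ('5f4dcc...', 'md5_hash')]
```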
Unlike most social media, where automatic archiving of data is the default, Snapchat defaults to ephemerality: deleting content shortly after it is viewed by a receiver. Interviews with 25 Snapchat users show that ephemerality plays a key role in shaping their practices. Along with friend-adding features that facilitate a network of mostly close relations, default deletion affords everyday, mundane talk and reduces self-consciousness while encouraging playful interaction. Further, although receivers can save content through screenshots, senders are notified; this selective saving with notification supports complex information norms that preserve the feel of ephemeral communication while supporting the capture of meaningful content. This dance of giving and taking, sharing and showing, and agency for both senders and receivers provides the basis for a rich design space of mechanisms, levels, and domains for ephemerality.
Hardware-based enclave protection mechanisms, such as Intel's SGX, ARM's TrustZone, and Apple's Secure Enclave, can protect code and data from powerful low-level attackers. In this work, we use enclaves to enforce strong application-specific information security policies. We present IMPE, a novel calculus that captures the essence of SGX-like enclave mechanisms, and show that a security-type system for IMPE can enforce expressive confidentiality policies (including erasure policies and delimited release policies) against powerful low-level attackers, including attackers that can arbitrarily corrupt non-enclave code, and, under some circumstances, corrupt enclave code. We present a translation from an expressive security-typed calculus (that is not aware of enclaves) to IMPE. The translation automatically places code and data into enclaves to enforce the security policies of the source program.
Presented at the NSA Science of Security Quarterly Meeting, July 2016.
Integrating security testing into the workflow of software developers can not only save the resources required for separate security testing but also reduce the cost of fixing security vulnerabilities by detecting them early in the development cycle. We present an automatic testing approach to detect a common type of Cross Site Scripting (XSS) vulnerability caused by improper encoding of untrusted data. We automatically extract the encoding functions used in a web application to sanitize untrusted inputs and then evaluate their effectiveness by automatically generating XSS attack strings. Our evaluations show that this technique can detect 0-day XSS vulnerabilities that cannot be found by static analysis tools. We also show that our approach can efficiently cover a common type of XSS vulnerability. This approach can be generalized to test input validation against other types of injection, such as command-line injection.
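A minimal sketch of the underlying idea, probing an extracted encoding function with attack strings, follows; it is illustrative only, and `html_attribute_encoder` is a hypothetical stand-in for whatever sanitizer would be extracted from the application.

```python
# Illustrative check of an encoding function against simple XSS probes.
# `html_attribute_encoder` stands in for an encoder extracted from the app.

def html_attribute_encoder(untrusted: str) -> str:
    # A deliberately weak example encoder: escapes <, >, & but not quotes.
    return untrusted.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;")

ATTACK_STRINGS = [
    '"><script>alert(1)</script>',
    "' onmouseover='alert(1)",
    "<img src=x onerror=alert(1)>",
]

def breaks_out_of_attribute(encoded: str) -> bool:
    # In a quoted HTML attribute context, an unescaped quote ends the value.
    return '"' in encoded or "'" in encoded

for attack in ATTACK_STRINGS:
    if breaks_out_of_attribute(html_attribute_encoder(attack)):
        print("potential XSS: encoder does not neutralize:", attack)
```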
We propose a new voting scheme, BeleniosRF, that offers both receipt-freeness and end-to-end verifiability. It is receipt-free in a strong sense, meaning that even dishonest voters cannot prove how they voted. We provide a game-based definition of receipt-freeness for voting protocols with non-interactive ballot casting, which we name strong receipt-freeness (sRF). To our knowledge, sRF is the first game-based definition of receipt-freeness in the literature, and it has the merit of being particularly concise and simple. Built upon the Helios protocol, BeleniosRF inherits its simplicity and does not require any anti-coercion strategy from the voters. We implement BeleniosRF and show its feasibility on a number of platforms, including desktop computers and smartphones.
Blockchain is an emerging technology for decentralized and transactional data sharing across a large network of untrusted participants. It enables new forms of distributed software architectures, where components can find agreement on their shared states without trusting a central integration point or any particular participating components. Considering the blockchain as a software connector helps make explicit important architectural considerations about the resulting performance and quality attributes (for example, security, privacy, scalability, and sustainability) of the system. Based on our experience in several projects using blockchain, in this paper we provide rationales to support the architectural decision on whether to employ a decentralized blockchain as opposed to other software solutions, like traditional shared data storage. Additionally, we explore specific implications of using the blockchain as a software connector, including design trade-offs regarding quality attributes.
The secure two-party computation (S2PC) protocols SHADE and GSHADE have been introduced by Bringer et al. in the last two years. GSHADE allows different distances (Hamming, Euclidean, Mahalanobis) to be computed quite efficiently and is among the most efficient S2PC methods. It can therefore be used to efficiently compute one-to-many identification for several biometric modalities (iris, face, fingerprint). In this paper, we introduce two extensions of GSHADE. The first enables us to evaluate new multiplicative functions; in this way, we show how to apply GSHADE to a classical machine learning algorithm. The second is a new proposal to secure GSHADE against malicious adversaries following the recent dual execution and cut-and-choose strategies, at very small additional cost. By preserving GSHADE's structure, our extensions remain very efficient compared to other S2PC methods.
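For orientation only, the distances that GSHADE evaluates obliviously are the standard ones; the snippet below computes them in the clear and is not the secure protocol itself.

```python
# Plaintext versions of the distances GSHADE evaluates securely
# (for orientation only; the protocol itself runs on shared/oblivious data).

def hamming(x, y):
    return sum(a != b for a, b in zip(x, y))

def squared_euclidean(x, y):
    return sum((a - b) ** 2 for a, b in zip(x, y))

def mahalanobis_sq(x, y, inv_cov):
    # inv_cov: inverse covariance matrix as nested lists.
    d = [a - b for a, b in zip(x, y)]
    return sum(d[i] * inv_cov[i][j] * d[j]
               for i in range(len(d)) for j in range(len(d)))

print(hamming([1, 0, 1, 1], [1, 1, 1, 0]))  # -> 2
```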
Today's infrastructure as a service (IaaS) cloud environments rely upon full trust in the provider to secure applications and data. Cloud providers do not offer the ability to create hardware-rooted cryptographic identities for IaaS cloud resources or sufficient information to verify the integrity of systems. Trusted computing protocols and hardware like the TPM have long promised a solution to this problem. However, these technologies have not seen broad adoption because of their complexity of implementation, low performance, and lack of compatibility with virtualized environments. In this paper we introduce keylime, a scalable trusted cloud key management system. keylime provides an end-to-end solution for both bootstrapping hardware rooted cryptographic identities for IaaS nodes and for system integrity monitoring of those nodes via periodic attestation. We support these functions in both bare-metal and virtualized IaaS environments using a virtual TPM. keylime provides a clean interface that allows higher level security services like disk encryption or configuration management to leverage trusted computing without being trusted computing aware. We show that our bootstrapping protocol can derive a key in less than two seconds, we can detect system integrity violations in as little as 110ms, and that keylime can scale to thousands of IaaS cloud nodes.
We present a technique for bounded invariant verification of nonlinear networked dynamical systems with delayed interconnections. The underlying problem in precise bounded-time verification lies with computing bounds on the sensitivity of trajectories (or solutions) to changes in initial states and inputs of the system. For large networks, computing this sensitivity with precision guarantees is challenging. We introduce the notion of input-to-state (IS) discrepancy of each module or subsystem in a larger nonlinear networked dynamical system. The IS discrepancy bounds the distance between two solutions or trajectories of a module in terms of their initial states and their inputs. Given the IS discrepancy functions of the modules, we show that it is possible to effectively construct a reduced (low-dimensional) time-delayed dynamical system, such that the trajectory of this reduced model precisely bounds the distance between the trajectories of the complete network with changed initial states. Using the above results we develop a sound and relatively complete algorithm for bounded invariant verification of networked dynamical systems consisting of nonlinear modules interacting through possibly delayed signals. Finally, we introduce a local version of IS discrepancy and show that it is possible to compute it using only the Lipschitz constant and the Jacobian of the dynamics function of the modules.
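Schematically (a paraphrase in the spirit of input-to-state stability, not the paper's exact definition), an IS discrepancy for a module bounds the deviation of its trajectories \(\xi\) in terms of the initial-state gap and the accumulated input gap:

```latex
% Schematic form of an input-to-state (IS) discrepancy bound (paraphrased;
% see the paper for the precise conditions on \beta and \gamma):
\[
  \bigl\lVert \xi(x_1,u_1,t) - \xi(x_2,u_2,t) \bigr\rVert
  \;\le\; \beta\bigl(\lVert x_1 - x_2 \rVert,\, t\bigr)
  \;+\; \int_0^{t} \gamma\bigl(\lVert u_1(s) - u_2(s) \rVert\bigr)\, ds .
\]
```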
Lightweight cryptography has been widely utilized in the resource-constrained embedded devices of Cyber-Physical System (CPS) terminals. The hostile and unattended environments in many scenarios make these endpoints easy targets for hardware-based attacks. As a resource-efficient countermeasure against fault attacks, parity Concurrent Error Detection (CED) is a preferred choice for integration with security-critical algorithms in CPS terminals. The parity bit changes if an odd number of faults occur during the cipher execution. In this paper, we analyze the fault-detection effectiveness of a parity-CED-protected cipher (PRESENT) using laser fault injection. The experimental results show that laser perturbation of the encryption can easily flip an even number of data bits, in which case the faults cannot be detected by parity. Due to the similarity of different parity structures, our attack can bypass almost all parity protections in block ciphers. Some suggestions are given to enhance the security of parity implementations.
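The weakness exploited here is easy to see in a toy model (illustrative only, not the PRESENT implementation): a single parity bit over a data word detects any odd number of bit flips but is blind to an even number.

```python
# Toy illustration: a single parity bit misses any even number of bit flips.

def parity(word: int) -> int:
    return bin(word).count("1") % 2

state = 0b10110010
stored_parity = parity(state)

faulty_even = state ^ 0b00000011   # fault flips two bits
faulty_odd  = state ^ 0b00000001   # fault flips a single bit

print(parity(faulty_even) == stored_parity)  # True  -> fault NOT detected
print(parity(faulty_odd) == stored_parity)   # False -> fault detected
```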
The C programming language never performs automatic bounds checking, in order to speed up execution, yet bounds checking is necessary: if an index goes out of bounds, serious errors such as endless loops or buffer overflows may occur during execution. When arrays are used in a program, every index must stay within the bounds of the array, but programmers often omit array bounds checks or write them incorrectly. In this paper, we perform static analysis based on taint analysis and data-flow analysis to detect which arrays lack correct bounds checking in a program, and we implement an automatic static tool, Carraybound. The experimental results show that Carraybound works effectively and efficiently.
Call traces, i.e., sequences of function calls and returns, are fundamental to a wide range of program analyses such as bug reproduction, fault diagnosis, performance analysis, and many others. The conventional approach to collecting call traces, which instruments each function call and return site, incurs large space and time overhead. Our approach aims at reducing the recording overhead by instrumenting only a small number of call sites while keeping the capability of recovering the full trace. We propose a call trace model and a logged call trace model based on an LL(1) grammar, which enables us to define the criteria of a feasible solution to call trace collection. Based on the two models, we prove that collecting call traces with minimal instrumentation is an NP-hard problem. We then propose an efficient approach to obtaining a suboptimal solution. We implemented our approach as a tool, Casper, and evaluated it using the DaCapo benchmark suite. The experimental results show that our approach incurs significantly lower runtime (and space) overhead than two state-of-the-art approaches.
SQL injection is one of the major threats to the security of web applications. Attackers try to gain unauthorized access to the database, which holds vital and private information about the users. Many researchers have proposed techniques and practices to protect web applications from attackers. There is a plethora of techniques available to perform SQL injection, and not everyone is familiar with every attack; hence, this kind of attack remains the most prevalent. In this paper, we present the types of SQL injection attacks and the predominant ways to prevent them.
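The most widely recommended prevention, parameterized queries, can be illustrated with a short sketch (a generic example, not tied to this particular paper):

```python
# Parameterized queries keep user input out of the SQL grammar entirely.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, password TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', 's3cret')")

user_input = "alice' OR '1'='1"   # classic injection payload

# Vulnerable: string concatenation lets the payload alter the query logic.
vulnerable = "SELECT * FROM users WHERE name = '" + user_input + "'"
print(len(conn.execute(vulnerable).fetchall()))   # rows leaked via tautology

# Safe: the driver binds the value; the payload is treated as a literal name.
safe = conn.execute("SELECT * FROM users WHERE name = ?", (user_input,))
print(len(safe.fetchall()))                       # 0 rows
```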
Today’s Internet services are often expected to stay available and render high responsiveness even in the face of site crashes and network partitions. Theoretical results state that causal consistency is one of the strongest consistency guarantees that is possible under these requirements, and many practical systems provide causally consistent key-value stores. In this paper, we present a framework called Chapar for modular verification of causal consistency for replicated key-value store implementations and their client programs. Specifically, we formulate separate correctness conditions for key-value store implementations and for their clients. The interface between the two is a novel operational semantics for causal consistency. We have verified the causal consistency of two key-value store implementations from the literature using a novel proof technique. We have also implemented a simple automatic model checker for the correctness of client programs. The two independently verified results for the implementations and clients can be composed to conclude the correctness of any of the programs when executed with any of the implementations. We have developed and checked our framework in Coq, extracted it to OCaml, and built executable stores.
This study stems from the premise that we need to break away from the "reactive" cycle of developing defenses against new DDoS attacks (e.g., amplification) by proactively investigating the potential for new types of DDoS attacks. Our specific focus is on pulsating attacks, a particularly debilitating type that has been hypothesized in the literature. In a pulsating attack, bots coordinate to generate intermittent pulses at target links to significantly reduce the throughput of TCP connections traversing the target. With pulsating attacks, attackers can cause significantly greater damage to legitimate users than with traditional link-flooding attacks. To date, however, pulsating attacks have been deemed either ineffective or easily defendable for two reasons: (1) they require a central coordinator and can thus be tracked; and (2) they require tight synchronization of pulses, which is difficult even in normal non-congestion scenarios. This paper argues that the perceived drawbacks of pulsating attacks are in fact not fundamental. We develop a practical pulsating attack called CICADAS using two key ideas: (1) congestion as an implicit signal for decentralized implementation, and (2) a Kalman-filter-based approach to achieve tight synchronization. We validate CICADAS using simulations and wide-area experiments. We also discuss possible countermeasures against this attack.
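For intuition only (not the CICADAS implementation), the role of a Kalman filter in such a setting is to track a noisy periodic quantity, here hypothetically the interval between observed congestion events, so that pulses can be fired in phase; the noise parameters below are made up.

```python
# 1-D Kalman filter tracking a noisy period estimate (intuition only;
# not the CICADAS implementation; q and r are assumed noise variances).

def kalman_step(x_est, p_est, z, q=0.01, r=0.25):
    """One predict/update step for a scalar constant-value model.
    x_est, p_est: previous estimate and its variance
    z: new noisy measurement (e.g., gap between observed congestion events)."""
    p_pred = p_est + q                 # predict
    k = p_pred / (p_pred + r)          # Kalman gain
    x_new = x_est + k * (z - x_est)    # update with measurement
    p_new = (1 - k) * p_pred
    return x_new, p_new

x, p = 1.0, 1.0                        # initial guess: 1 s period
for z in [1.9, 2.1, 2.0, 1.95, 2.05]:  # noisy observations around 2 s
    x, p = kalman_step(x, p, z)
print(round(x, 2))                     # converges toward ~2.0
```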
With the increasing incentive of enterprises to ingest as much data as they can in what is commonly referred to as "data lakes", and with the recent development of multiple technologies to support this "load-first" paradigm, the new environment presents serious data management challenges. Among them, assessing data quality and cleaning large volumes of heterogeneous data sources become essential tasks in unveiling the value of big data. The coveted use of unstructured and semi-structured data in large volumes makes current data cleaning tools (primarily designed for relational data) not directly adoptable. We present CLAMS, a system to discover and enforce expressive integrity constraints from large amounts of lake data with very limited schema information (e.g., represented as RDF triples). This demonstration shows how CLAMS is able to discover the constraints and the schemas they are defined on simultaneously. CLAMS also introduces a scale-out solution to efficiently detect errors in the raw data. CLAMS interacts with human experts both to validate the discovered constraints and to suggest data repairs. CLAMS has been deployed in a real large-scale enterprise data lake and evaluated on a real data set of 1.2 billion triples. It has been able to spot multiple obscure data inconsistencies and errors early in the data processing stack, providing huge value to the enterprise.
The function of query auto-completion in modern search engines is to help users formulate queries quickly and precisely. Conventional context-aware methods primarily rank candidate queries according to term- and query-level relationships with the context. However, most sessions are extremely short, and capturing search intents with such relationships becomes difficult when the context contains only a few queries. In this paper, we investigate the feasibility of discovering search intents within short contexts for query auto-completion. The class distribution of the search session (i.e., issued queries and click behavior) is derived as the search intent. Several distribution-based features are proposed to estimate the proximity between candidates and search intents. Finally, we apply learning-to-rank to predict the user's intended query according to these features. Moreover, we also design an ensemble model to combine the benefits of our proposed features and conventional term-based approaches. Extensive experiments have been conducted on the publicly available AOL search engine log. The experimental results demonstrate that our approach significantly outperforms six competitive baselines. The performance at different numbers of keystrokes is also evaluated in the experiments. Furthermore, an in-depth analysis is made to justify the usability of search intent classification for query auto-completion.
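One distribution-based feature of the kind described can be sketched as a similarity between a candidate query's class distribution and the session's aggregated intent distribution (illustrative only; the class taxonomy and probabilities below are invented for the example).

```python
# Illustrative distribution-based feature: cosine similarity between a
# candidate's class distribution and the session's intent distribution.
import math

def cosine(p: dict, q: dict) -> float:
    keys = set(p) | set(q)
    dot = sum(p.get(k, 0.0) * q.get(k, 0.0) for k in keys)
    norm_p = math.sqrt(sum(v * v for v in p.values()))
    norm_q = math.sqrt(sum(v * v for v in q.values()))
    return dot / (norm_p * norm_q) if norm_p and norm_q else 0.0

session_intent = {"sports": 0.7, "news": 0.2, "shopping": 0.1}
candidates = {
    "nba scores": {"sports": 0.9, "news": 0.1},
    "shoe store": {"shopping": 0.8, "sports": 0.2},
}
ranked = sorted(candidates,
                key=lambda c: cosine(candidates[c], session_intent),
                reverse=True)
print(ranked)  # ['nba scores', 'shoe store']
```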
We present a first of its kind framework which overcomes a major challenge in the design of digital systems that are resilient to reliability failures: achieve desired resilience targets at minimal costs (energy, power, execution time, area) by combining resilience techniques across various layers of the system stack (circuit, logic, architecture, software, algorithm). This is also referred to as cross-layer resilience. In this paper, we focus on radiation-induced soft errors in processor cores. We address both single-event upsets (SEUs) and single-event multiple upsets (SEMUs) in terrestrial environments. Our framework automatically and systematically explores the large space of comprehensive resilience techniques and their combinations across various layers of the system stack (798 cross-layer combinations in this paper), derives cost-effective solutions that achieve resilience targets at minimal costs, and provides guidelines for the design of new resilience techniques. We demonstrate the practicality and effectiveness of our framework using two diverse designs: a simple, in-order processor core and a complex, out-of-order processor core. Our results demonstrate that a carefully optimized combination of circuit-level hardening, logic-level parity checking, and micro-architectural recovery provides a highly cost-effective soft error resilience solution for general-purpose processor cores. For example, a 50× improvement in silent data corruption rate is achieved at only 2.1% energy cost for an out-of-order core (6.1% for an in-order core) with no speed impact. However, selective circuit-level hardening alone, guided by a thorough analysis of the effects of soft errors on application benchmarks, provides a cost-effective soft error resilience solution as well (with ~1% additional energy cost for a 50× improvement in silent data corruption rate).