Biblio
Programming languages that permit direct memory access through pointers often leave applications with memory-related errors, which may lead to unpredictable failures and security vulnerabilities. This paper presents a lightweight solution for dynamically detecting such illegal memory accesses in C/C++ applications. We propose a new and effective method of instrumenting an application's source code at compile time to detect out-of-bound memory accesses. It is based on creating a tag coupled with each memory allocation and inserting tag-checking instructions before each memory access. The proposed solution is evaluated by instrumenting applications from the BugBench benchmark suite and the publicly available Runtime Intrusion Prevention Evaluator (RIPE) benchmark, detecting all the bugs successfully. The performance and memory overheads are further analysed by instrumenting and executing real-world applications.
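The tagging scheme lends itself to a compact illustration. Below is a minimal Python model of the idea only (the paper instruments C/C++ source at compile time); the `TaggedHeap` class and its method names are invented for illustration: each allocation records a tag with its bounds, and a check runs before every access.

```python
# Minimal Python model of tag-based bounds checking (illustrative only;
# the paper instruments C/C++ at compile time). Each allocation records
# its base and size in a tag table; every access is checked against it.

class TaggedHeap:
    def __init__(self):
        self.next_addr = 0x1000
        self.tags = {}          # base address -> allocation size

    def malloc(self, size):
        base = self.next_addr
        self.tags[base] = size  # create the tag alongside the allocation
        self.next_addr += size
        return base

    def check(self, base, offset):
        # The check the instrumentation inserts before each memory access.
        size = self.tags.get(base)
        if size is None or not (0 <= offset < size):
            raise MemoryError(f"out-of-bounds access at {base:#x}+{offset}")

heap = TaggedHeap()
buf = heap.malloc(16)
heap.check(buf, 15)       # in bounds: passes silently
try:
    heap.check(buf, 16)   # one past the end: the inserted check fires
except MemoryError as e:
    print(e)
```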
Providing efficient protection against energy-consumption-based side-channel attacks (SCAs) for block ciphers is a relevant topic for the research community, as current overheads are in the 100x range. Unprofiled SCAs exploit information leakage from the outermost rounds of a cipher; we propose a solution that encases the cipher between keyed transformations amenable to efficient SCA protection. Our solution can be employed as a drop-in replacement for an unprotected implementation, or be retrofitted to an existing one, while retaining communication capabilities with legacy insecure endpoints. Experiments on a Cortex-M4 microcontroller show performance improvements in the range of 60x compared with available solutions.
Side-channel attacks (SCAs) have proven to be a practical threat to the security of embedded systems, exploiting information leakage from unintended channels of an implementation of a cryptographic primitive. Given the large variety of embedded platforms and the ubiquity of the need for secure cryptographic implementations, a systematic and automated approach to deploying SCA countermeasures at design time is strongly needed. In this paper, we provide an overview of recent compiler-based techniques for protecting software implementations against SCAs, making them amenable to automated application in the development of secure-by-design systems.
Physical attacks, especially fault attacks, represent one of the major threats against embedded systems. In the state of the art, software countermeasures against fault attacks are applied either at the source-code level, where they are very likely to be removed at compilation time, or at the assembly level, where several transformations need to be performed on the assembly code, leading to significant overheads in both code size and execution time. This paper presents the use of compiler techniques to efficiently automate the application of software countermeasures against instruction-skip fault attacks. We propose a modified LLVM compiler that considers our security objectives throughout the compilation process. Experimental results illustrate the effectiveness of this approach on AES implementations running on an ARM-based microcontroller, in terms of both security and overhead, compared to existing solutions.
Heap buffer overflow (and underflow) errors are a common source of security vulnerabilities. One prevention mechanism is to add object-bounds meta-information and to instrument the program with explicit bounds checks for all memory accesses. The so-called "fat pointers" approach is one method for maintaining and propagating the meta-information, where native machine pointers are replaced with "fat" objects that explicitly store object bounds. Another approach is "low-fat pointers", which encodes the meta-information within the native pointer itself, eliminating both space overheads and code-compatibility issues. This paper presents a new low-fat pointer encoding that is fully compatible with existing libraries (e.g. pre-compiled libraries unaware of the encoding) and standard hardware (e.g. x86_64). We show that our approach has very low memory overhead and is competitive with existing state-of-the-art bounds-instrumentation solutions.
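The pointer encoding admits a small arithmetic sketch. The layout below is illustrative, not the paper's exact scheme: the heap is partitioned into equal-sized regions, each dedicated to one allocation size, so an object's size and base are recomputable from the pointer value alone.

```python
# Low-fat pointer arithmetic, modelled with illustrative constants: the
# region an address falls in encodes its allocation size, and objects are
# size-aligned within a region, so no per-pointer metadata is needed.

REGION_SIZE = 2 ** 32          # one virtual-memory region per size class
SIZES = [16, 32, 64, 128]      # allocation size served by each region

def size_of(ptr):
    return SIZES[ptr // REGION_SIZE]

def base_of(ptr):
    return ptr - ptr % size_of(ptr)

def check_access(orig_ptr, access_ptr, width=1):
    # Bounds check against the object the *original* pointer belongs to.
    base = base_of(orig_ptr)
    return base <= access_ptr and access_ptr + width <= base + size_of(orig_ptr)

p = 1 * REGION_SIZE + 40       # an address inside the 32-byte region
assert size_of(p) == 32 and base_of(p) == REGION_SIZE + 32
assert check_access(p, p + 20)         # still inside the 32-byte object
assert not check_access(p, p + 30)     # overflow into the neighbouring slot
```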
Information-centric networking (ICN) has been actively studied as a promising alternative to the IP-based Internet architecture, with potential benefits in terms of network efficiency, privacy, security, and novel applications. However, such a wholesale replacement of the IP-based Internet with a new routing and service infrastructure is difficult to adopt due to conflicts among existing stakeholders, market players, and solution providers. To overcome these difficulties, we provide an evolutionary approach by which we enable the expected benefits of ICN for existing services. The demonstration shows that these benefits can be efficiently introduced and work with existing IP end-systems.
The Information-Centric Networking (ICN) paradigm is drastically different from traditional host-centric IP networking. As a consequence of this disparity, the security models are also very different. The security model for IP is based on securing the end-to-end communication link between the communicating nodes, whereas the ICN security model is based on securing data objects, often termed object security. Just like the traditional security model, object security poses a key-management challenge. This is especially concerning for ICN, as data cached in its encrypted form should be usable by several different users. Attribute-Based Encryption (ABE) alleviates this problem by enabling data to be encrypted under a policy that suits several different types of users. Users with different sets of attributes can potentially decrypt the data, hence eliminating the need to encrypt the data separately for each type of user. However, ABE is more processing-intensive than traditional public-key encryption methods, posing a challenge for resource-constrained environments with devices that have little memory and battery power. In this demo we show ABE encryption carried out on a resource-constrained sensor platform. The encrypted data is transported over an ICN network and is decrypted only by clients that have the correct set of attributes.
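For concreteness, the policy side of ABE (and only that; none of the underlying pairing-based cryptography) can be sketched as follows. The policy shape and attribute names are invented for illustration: one ciphertext carries one policy, and any attribute set satisfying it can decrypt, so the same cached encrypted object serves many user types.

```python
# Illustrates only the access-structure logic of ABE (no cryptography):
# a ciphertext carries a policy tree; any attribute set satisfying it
# can decrypt, so one encrypted object serves several types of users.

def satisfies(policy, attributes):
    op, *children = policy
    if op == "ATTR":
        return children[0] in attributes
    results = [satisfies(c, attributes) for c in children]
    return all(results) if op == "AND" else any(results)

# Policy: (doctor AND cardiology) OR auditor -- illustrative attributes.
policy = ("OR",
          ("AND", ("ATTR", "doctor"), ("ATTR", "cardiology")),
          ("ATTR", "auditor"))

print(satisfies(policy, {"doctor", "cardiology"}))  # True
print(satisfies(policy, {"auditor"}))               # True
print(satisfies(policy, {"doctor"}))                # False
```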
Content-centric networking (CCN) is a networking paradigm that emphasizes request-response-based data transfer. A consumer issues a request explicitly referencing desired data by name. A producer assigns a name to each data object it publishes. Names are used both to identify data and to route traffic between consumers and producers. The type, format, and representation of names are fundamental to CCN. Currently, names are represented as human-readable application-layer URIs, which has several important security and performance implications for the network. In this paper, we propose to transparently decouple application-layer names from their network-layer counterparts. We demonstrate a mapping between the two namespaces that can be deterministically computed by consumers and producers, using application names formatted according to the standard CCN URI scheme. Meanwhile, consumers and producers can continue to use application-layer names. We detail the computation and mapping-function requirements and discuss their impact on consumers, producers, and routers. Finally, we comprehensively analyze several mapping functions, show their functional equivalence to standard application names, and argue that they address several issues that stem from propagating application names into the network.
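One family of mapping functions of the kind analyzed here can be sketched as follows (an assumed construction for illustration, not necessarily the paper's): hash each URI component separately, so consumers and producers derive the same network-layer name independently while routers retain component-wise longest-prefix matching.

```python
# A plausible deterministic name-mapping function (illustrative): hash
# each component of a CCN application-layer URI so both endpoints can
# compute the same network-layer name, preserving the name hierarchy.

import hashlib

def to_network_name(app_uri, digest_bytes=8):
    components = app_uri.lstrip("/").split("/")
    hashed = [hashlib.sha256(c.encode()).hexdigest()[: 2 * digest_bytes]
              for c in components]
    return "/" + "/".join(hashed)

# Consumer and producer derive the same network name independently.
print(to_network_name("/parc/videos/clip.mpg"))
```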
In content-based security, encrypted content as well as wrapped access keys are made freely available by an Information-Centric Network: only those clients able to unwrap the encryption key can access the protected content. In this paper we extend this model to computation chains, where derived data (e.g. produced by a Named Function Network) also has to comply with the content-based security approach. A central problem to solve is the synchronized, on-demand publishing of encrypted results and wrapped keys, as well as defining the set of consumers authorized to access the derived data. We introduce "content-attendant policies" and report on a running prototype that demonstrates how to enforce data-owner-defined access control policies despite fully decentralized and arbitrarily long computation chains.
The emerging Information-Centric Networking (ICN) paradigm is expected to facilitate content sharing among users. ICN will make it easy for users to appoint storage nodes, in various network locations, perhaps owned or controlled by them, where shared content can be stored and disseminated from. These storage nodes should be (somewhat) trusted, since not only do they have (some level of) access to user-shared content, but they should also properly enforce access control. Traditional forms of encryption introduce significant overhead when it comes to sharing content with large and dynamic groups of users. To this end, proxy re-encryption provides a convenient solution. In this paper, we use Identity-Based Proxy Re-Encryption (IB-PRE) to provide confidentiality and access control for content items shared over ICN, realizing secure content distribution among dynamic sets of users. In contrast to similar IB-PRE-based solutions, our design allows each user to generate the system parameters and the secret keys required by the underlying encryption scheme using their own Private Key Generator; therefore, our approach does not suffer from the key-escrow problem. Moreover, our design further relaxes the trust requirements on the storage nodes by preventing them from sharing usable content with unauthorized users. Finally, our scheme does not require out-of-band secret key distribution.
The shift from the host-centric to the information-centric paradigm results in many benefits, including native security, enhanced mobility, and scalability. The corresponding information-centric networking (ICN) paradigm also presents several important challenges, such as closest-replica routing, client privacy, and client preference collection. The majority of these challenges have received the research community's attention. However, no mechanisms have been proposed for the challenge of effective client preference collection. In the era of big-data analytics and recommender systems, customer preferences are essential for providers such as Amazon and Netflix. However, with content served from in-network caches, the ICN paradigm indirectly undermines the gathering of these essential individualized preferences. In this paper, we discuss the requirements for client preference collection and present potential mechanisms that may be used to achieve it successfully.
Code diversification is an effective mitigation against return-oriented programming (ROP) attacks: it breaks attackers' assumptions about the location and structure of useful instruction sequences, known as "gadgets". Although a wide range of code diversification techniques of varying levels of granularity exist, most of them rely on the availability of source code, debug symbols, or the assumption of fully precise code disassembly, limiting their practical applicability for the protection of closed-source third-party applications. In-place code randomization has been proposed as an alternative binary-compatible diversification technique that is tolerant of partial disassembly coverage, at the expense, though, of leaving some gadgets intact, at the disposal of attackers. Consequently, the possibility of constructing robust ROP payloads using only the remaining non-randomized gadgets is still open. In this paper we present instruction displacement, a code diversification technique based on static binary instrumentation that does not rely on complete code disassembly coverage. Instruction displacement aims to improve the randomization coverage and entropy of existing binary-level code diversification techniques by displacing any remaining non-randomized gadgets to random locations. The results of our experimental evaluation demonstrate that instruction displacement reduces the number of non-randomized gadgets in the extracted code regions from 15.04% for standalone in-place code randomization to 2.77% for the combination of both techniques. At the same time, the additional indirection introduced by displacement incurs a negligible runtime overhead of 0.36% on average for the SPEC CPU2006 benchmarks.
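The displacement mechanism itself is simple to model. The toy below operates on a flat byte buffer with invented offsets (a real implementation rewrites x86 binaries and must handle relocations, overlapping instructions, and jump targets): the gadget's bytes are moved elsewhere and replaced by a 5-byte relative jmp, with a jmp back appended to the displaced copy.

```python
# Toy model of instruction displacement over a flat byte buffer
# (illustrative offsets; not a real binary-rewriting tool).

import struct

def displace(code, start, length, dest):
    """Move code[start:start+length] to offset `dest`, patching jmps."""
    assert length >= 5, "need room for a 5-byte jmp rel32"
    moved = bytes(code[start:start + length])
    # jmp rel32 at `start`, landing on the displaced copy at `dest`
    rel = dest - (start + 5)
    code[start:start + 5] = b"\xe9" + struct.pack("<i", rel)
    code[start + 5:start + length] = b"\xcc" * (length - 5)  # int3 filler
    # displaced copy, followed by a jmp back to the next original byte
    back = (start + length) - (dest + length + 5)
    code[dest:dest + length] = moved
    code[dest + length:dest + length + 5] = b"\xe9" + struct.pack("<i", back)

code = bytearray(b"\x90" * 64)      # 64 bytes of NOPs as stand-in code
displace(code, start=8, length=8, dest=40)
assert code[8] == 0xE9 and code[40:48] == b"\x90" * 8   # jmp in, copy out
```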
Code reuse attacks based on return-oriented programming (ROP) are becoming more and more prevalent every year. They started as a way to circumvent operating-system protections against injected code, but they are now also used as a technique to keep malicious code hidden from detection and analysis systems. This means that while in the past ROP chains were short and simple (and therefore did not require any dedicated tool for their analysis), we have recently started to observe very complex algorithms, such as a complete rootkit, implemented entirely as a sequence of ROP gadgets. In this paper, we present a set of techniques to analyze complex code reuse attacks. First, we identify and discuss the main challenges that complicate the reverse engineering of code implemented using ROP. Second, we propose an emulation-based framework to dissect, reconstruct, and simplify ROP chains. Finally, we test our tool on the most complex example available to date: a ROP rootkit containing four separate chains, two of them dynamically generated at runtime.
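At its core, dissecting a chain means replaying the stack the way the CPU's ret instructions would. A minimal sketch with a two-gadget table invented for illustration (the paper's framework emulates real x86 code):

```python
# Minimal sketch of emulation-based ROP-chain dissection (illustrative):
# walk the chain by repeatedly "returning" to the next stack slot and
# applying each gadget's effect, recording a linearized trace.

GADGETS = {   # address -> (mnemonic, effect on machine state)
    0x1000: ("pop rax; ret",
             lambda st, chain: st.update(rax=chain.pop(0))),
    0x2000: ("add rax, 8; ret",
             lambda st, chain: st.update(rax=st["rax"] + 8)),
}

def emulate(chain):
    state, trace = {"rax": 0}, []
    while chain:
        addr = chain.pop(0)      # the 'ret' consumes one stack slot
        name, effect = GADGETS[addr]
        effect(state, chain)     # pops any inline operands too
        trace.append(name)
    return state, trace

state, trace = emulate([0x1000, 0x41, 0x2000])
print(trace)                 # ['pop rax; ret', 'add rax, 8; ret']
print(hex(state["rax"]))     # 0x49
```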
Despite a long history and numerous proposed defenses, memory corruption attacks are still viable. A secure and low-overhead defense against return-oriented programming (ROP) continues to elude the security community. Currently proposed solutions must still choose between either not fully protecting critical data, relying instead on information hiding, or using incomplete, coarse-grained checking that can be circumvented by a suitably skilled attacker. In this paper, we present a lightweight memory protection approach (LMP) that uses Intel's MPX hardware extensions to provide complete, fast ROP protection without having to rely on information hiding. We demonstrate a prototype that defeats ROP attacks while incurring an average runtime overhead of 3.9%.
Remote attestation is a crucial security service particularly relevant to increasingly popular IoT (and other embedded) devices. It allows a trusted party (verifier) to learn the state of a remote, and potentially malware-infected, device (prover). Most existing approaches are static in nature and only check whether benign software is initially loaded on the prover. However, they are vulnerable to runtime attacks that hijack the application's control or data flow, e.g., via return-oriented programming or data-oriented exploits. As a concrete step towards more comprehensive runtime remote attestation, we present the design and implementation of Control-FLow ATtestation (C-FLAT) that enables remote attestation of an application's control-flow path, without requiring the source code. We describe a full prototype implementation of C-FLAT on Raspberry Pi using its ARM TrustZone hardware security extensions. We evaluate C-FLAT's performance using a real-world embedded (cyber-physical) application, and demonstrate its efficacy against control-flow hijacking attacks.
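The attested measurement can be stated in a few lines. A minimal sketch of the hash-chain idea, assuming SHA-256 and 64-bit branch targets (the actual scheme runs the measurement inside TrustZone on an instrumented binary):

```python
# Sketch of control-flow attestation's core measurement (illustrative):
# the attested value is a cumulative hash over the sequence of taken
# branch/return targets, so any deviation from a legitimate path yields
# a different final hash that the verifier can detect.

import hashlib

def extend(h, branch_target):
    return hashlib.sha256(h + branch_target.to_bytes(8, "little")).digest()

def measure(path):
    h = b"\x00" * 32                 # initial measurement
    for target in path:
        h = extend(h, target)        # updated at each control-flow event
    return h

legit = measure([0x1000, 0x1040, 0x1080])
hijacked = measure([0x1000, 0x1040, 0xdead])   # diverted control flow
assert legit != hijacked             # the verifier detects the deviation
```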
Cache side-channel attacks have been extensively studied on x86 architectures, but much less so on ARM processors. The technical challenges of conducting side-channel attacks on ARM presumably stem from the poorly documented ARM cache implementations, such as cache coherence protocols and cache flush operations, and also from a lack of understanding of how different cache implementations affect side-channel attacks. This paper presents a systematic exploration of vectors for Flush-Reload attacks on ARM processors. Flush-Reload attacks are among the most well-known cache side-channel attacks on x86, and previous work has shown that they are capable of exfiltrating sensitive information with high fidelity. We demonstrate in this work a novel construction of Flush-Reload side channels on the last-level caches of ARM processors which, in particular, exploits return-oriented programming techniques to reload instructions. We also demonstrate several attacks on Android OS (e.g., detecting hardware events and tracing software execution paths) to highlight the implications of such attacks for Android devices.
Diversity has been suggested as an effective alternative to the current trend in rules-based approaches to cybersecurity. However, little work to date has focused on how various techniques generalize to new attacks. That is, there is no accepted methodology that researchers use to evaluate diversity techniques. Starting with the hypothesis that an attacker's effort increases as the common set of executable code snippets (return-oriented programming (ROP) gadgets) decreases across application variants, we explore how different diversification techniques affect the set of ROP gadgets that is available to an attacker. We show that a small population of diversified variants is sufficient to eliminate 90-99% of ROP gadgets across a collection of real-world applications. Finally, we observe that the number of remaining gadgets may still be sufficient for an attacker to mount an effective attack regardless of the presence of software diversity.
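The underlying metric is set intersection over gadget populations. A minimal sketch with invented gadget sets: a gadget remains usable for a population-wide attack only if it survives, at the same location, in every variant.

```python
# The paper's core metric, in miniature (illustrative data): the gadgets
# still useful against the whole population are those whose (address,
# semantics) survive in every diversified variant.

variants = [
    {(0x400100, "pop rdi; ret"), (0x400200, "pop rsi; ret"),
     (0x400300, "syscall; ret")},
    {(0x400100, "pop rdi; ret"), (0x400300, "syscall; ret")},
    {(0x400300, "syscall; ret")},
]

surviving = set.intersection(*variants)
total = len(set.union(*variants))
print(f"eliminated {100 * (1 - len(surviving) / total):.0f}% of gadgets")
# -> eliminated 67%; one survivor may still be enough for an attack
```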
This demo dramatically illustrates how replacing 'Classic' TCP congestion control (Reno, Cubic, etc.) with a 'Scalable' alternative like Data Centre TCP (DCTCP) keeps queuing delay ultra-low; not just for a select few light applications like voice or gaming, but even when a variety of interactive applications all heavily load the same (emulated) Internet access. DCTCP has so far been confined to data centres because it is too aggressive: it starves Classic TCP flows. To allow DCTCP to be exploited on the public Internet, we developed DualQ Coupled Active Queue Management (AQM), which allows the two TCP types to safely co-exist. Visitors can test all these claims. As well as running Web-based apps, they can pan and zoom a panoramic video of a football stadium on a touch-screen, and experience how their personalized HD scene seems to stick to their finger, even though it is encoded on the fly on servers accessed via an emulated delay, representing 'the cloud'. A pair of VR goggles can be used at the same time, making a similar point. The demo provides a dashboard so that visitors can not only experience the interactivity of each application live, but they can also quantify it via a wide range of performance stats, updated live. It also includes controls so visitors can configure different TCP variants, AQMs, network parameters and background loads, and immediately test the effect.
With the world population becoming increasingly urban and mega-cities multiplying, urban leaders have responded with plans calling for so-called smart cities, which rely on instantaneous access to information using mobile devices for intelligent management of resources. Coupled with the advent of the smartphone as the main platform for accessing the Internet, this has created the conditions for a looming wireless bandwidth crunch. This paper presents a content delivery infrastructure relying on off-the-shelf technology and the public transportation network (PTN), aimed at relieving the wireless bandwidth crunch in urban centers. Our solution proposes installing WiFi access points on selected public bus stations and buses, using the latter as data mules, thereby creating a delay-tolerant network capable of carrying content that users can access while using public transportation. Building such an infrastructure poses several challenges, including congestion points in major hubs and the cost of the additional hardware necessary for secure communications. To address these challenges we propose a 3-Tier architecture that guarantees end-to-end delivery and minimizes hardware cost. Trace-based simulations from three major European cities (Paris, Helsinki, and Toulouse) demonstrate the viability of our design choices. In particular, the 3-Tier architecture is shown to guarantee end-to-end connectivity and reduce the deployment cost severalfold while delivering at least as many packets as a baseline architecture.
Congestion control (CC) algorithms are essential to quickly restore network performance to a stable state whenever congestion occurs. A majority of existing CC algorithms are implemented at the transport layer, mostly coupled with TCP. Over the past three decades, CC algorithms have incrementally evolved, resulting in many extensions of TCP. A thorough evaluation of a new TCP extension is a huge task. Hence, the Internet Congestion Control Research Group (ICCRG) has proposed a common TCP evaluation suite that helps researchers gain an initial insight into the working of their proposed TCP extension. This paper presents an implementation of the TCP evaluation suite in ns-3 that automates simulation setup, topology creation, traffic generation, execution, and results collection. We also describe the internals of our implementation and demonstrate its usage for evaluating the performance of five TCP extensions available in ns-3, by automatically setting up the following simulation scenarios: (i) single and multiple bottleneck topologies, (ii) varying bottleneck bandwidth, (iii) varying bottleneck RTT, and (iv) varying the number of long flows.
When multiple TCP connections are used between the same host pair, they often share a common bottleneck, especially when they are encapsulated together, e.g. in VPN scenarios. Then, connections after the first should not have to guess the right initial value for the congestion window, but should rather get the appropriate value from the other connections. This allows short flows to complete much faster, but it can also lead to large bursts that cause problems of their own. Prior work used timer-based pacing methods to alleviate this problem; we introduce a new algorithm that "paces" packets by instead correctly maintaining the ACK clock, and show its positive impact in combination with a previously presented congestion coupling algorithm.
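The contrast the paper draws can be modelled in a few lines (a toy with assumed timings, not the paper's algorithm): the same inherited congestion window is either emitted at once, producing a burst, or released one packet per returning ACK, spreading transmissions across the round-trip time.

```python
# Toy model of burst vs. ACK-clocked release of an inherited window
# (illustrative timings): each entry is a (send_time_s, seq) pair.

def burst_send(cwnd):
    return [(0.0, seq) for seq in range(cwnd)]        # all at t=0

def ack_clocked_send(cwnd, ack_times):
    # one new packet released per ACK of already-in-flight packets
    return [(t, seq) for seq, t in zip(range(cwnd), ack_times)]

acks = [0.01 * i for i in range(10)]     # ACKs arriving every 10 ms
print(burst_send(10))                    # 10 packets back to back
print(ack_clocked_send(10, acks))        # the same 10 packets, paced
```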
Differential privacy enables organizations to collect accurate aggregates over sensitive data with strong, rigorous guarantees on individuals' privacy. Previous work has found that under differential privacy, computing multiple correlated aggregates as a batch, using an appropriate strategy, may yield higher accuracy than computing each of them independently. However, finding the best strategy that maximizes result accuracy is non-trivial, as it involves solving a complex constrained optimization program that appears to be non-convex. Hence, in the past much effort has been devoted to solving this non-convex optimization program, with existing approaches including various sophisticated heuristics and expensive numerical solutions; none of them, however, is guaranteed to find the optimal solution. This paper points out that under (ε, δ)-differential privacy, the optimal solution of the above constrained optimization problem in search of a suitable strategy can be found, rather surprisingly, by solving a simple and elegant convex optimization program. We then propose an efficient algorithm based on Newton's method, which we prove always converges to the optimal solution, with a linear global convergence rate and a quadratic local convergence rate. Empirical evaluations demonstrate the accuracy and efficiency of the proposed solution.
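The optimization program in question is commonly stated in matrix-mechanism form; the rendering below is a standard one assumed here for orientation, with the exact constants depending on (ε, δ). A workload W of linear queries is answered through a strategy matrix A, with Gaussian noise calibrated to A's L2 sensitivity:

```latex
% Strategy-based answering of a linear-query workload W (matrix-mechanism
% form; constants depending on (eps, delta) omitted):
\[
  M(x) \;=\; W A^{+}\bigl(Ax + z\bigr), \qquad
  z \sim \mathcal{N}(0, \sigma^{2} I), \qquad
  \sigma \;\propto\; \Delta_{2}(A) = \max_{j} \lVert a_{j} \rVert_{2},
\]
\[
  \operatorname{err}(A) \;\propto\; \Delta_{2}(A)^{2}\,
  \lVert W A^{+} \rVert_{F}^{2},
\]
% and the paper's contribution is showing that minimizing err(A) over
% strategies A can be recast as a convex program solved by Newton's method.
```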
Delegating computation, which is applicable to many practical contexts such as cloud computing or pay-TV systems, concerns the task where a computationally weak client wants to securely compute a very complex function f on a given input with the help of a remote, computationally strong but untrusted server. The requirement is that the client's computation be much more efficient than evaluating f itself; ideally it should run in constant time or in NC0. This task has been investigated in several contexts, such as instance hiding, randomized encoding, fully homomorphic encryption, garbling schemes, and verifiable computation. In this work, we specifically consider the setting where only the client has an input and gets an output, also called instance hiding. Concretely, we first give a survey of delegating computation, and then propose an efficient instance hiding scheme with passive input privacy. In our scheme, the computation complexity of the client is in NC0 and that of the server is exactly the same as that of the original function f. Regarding communication complexity, the client just needs to transfer 4|f| + |x| bits to the server, where |f| is the size of the circuit representing f and |x| is the length of the input of f.
Adaptivity is an important feature of data analysis: the choice of questions to ask about a dataset often depends on previous interactions with the same dataset. However, statistical validity is typically studied in a nonadaptive model, where all questions are specified before the dataset is drawn. Recent work by Dwork et al. (STOC, 2015) and Hardt and Ullman (FOCS, 2014) initiated a general formal study of this problem, and gave the first upper and lower bounds on the achievable generalization error for adaptive data analysis. Specifically, suppose there is an unknown distribution P and a set x of n independent samples drawn from P. We seek an algorithm that, given x as input, accurately answers a sequence of adaptively chosen "queries" about the unknown distribution P. How many samples n must we draw from the distribution, as a function of the type of queries, the number of queries, and the desired level of accuracy? In this work we make two new contributions towards resolving this question. We give upper bounds on the number of samples n that are needed to answer statistical queries; the bounds improve and simplify the work of Dwork et al. (STOC, 2015), and have been applied in subsequent work by those authors (Science, 2015; NIPS, 2015). We prove the first upper bounds on the number of samples required to answer more general families of queries, including arbitrary low-sensitivity queries and an important class of optimization queries (alternatively, risk minimization queries). As in Dwork et al., our algorithms are based on a connection with algorithmic stability in the form of differential privacy. We extend their work by giving a quantitatively optimal, more general, and simpler proof of their main theorem that the stability notion guaranteed by differential privacy implies low generalization error. We also show that weaker stability guarantees, such as bounded KL divergence and total variation distance, lead to correspondingly weaker generalization guarantees.
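The central connection can be stated compactly. One basic form of the transfer lemma, for a pure ε-differentially private mechanism A that outputs a statistical query q with range [0, 1] (the paper generalizes and sharpens statements of this kind):

```latex
% Differential privacy implies generalization (basic form, pure eps-DP,
% [0,1]-valued statistical queries):
\[
  \mathbb{E}_{S \sim P^{n},\; q \sim \mathcal{A}(S)}\bigl[ q(S) \bigr]
  \;\le\; e^{\varepsilon}\,
  \mathbb{E}_{S \sim P^{n},\; q \sim \mathcal{A}(S)}\bigl[ q(P) \bigr],
\]
% so for small eps the empirical answer q(S) cannot substantially overfit
% the sample relative to its population value q(P).
```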
We initiate the study of a quantity that we call coordination complexity. In a distributed optimization problem, the information defining a problem instance is distributed among n parties, who each need to choose an action; jointly, these actions form a solution to the optimization problem. The coordination complexity represents the minimal amount of information that a centralized coordinator, who has full knowledge of the problem instance, needs to broadcast in order to coordinate the n parties to play a nearly optimal solution. We show that upper bounds on the coordination complexity of a problem imply the existence of good jointly differentially private algorithms for solving that problem, which in turn are known to upper bound the price of anarchy in certain games with dynamically changing populations. We show several results. We fully characterize the coordination complexity for the problem of computing a many-to-one matching in a bipartite graph. Our upper bound in fact extends much more generally to the problem of solving a linearly separable convex program. We also give a different upper bound technique, which we use to bound the coordination complexity of coordinating a Nash equilibrium in a routing game, and of computing a stable matching.