Biblio
Today, mobile data owners cannot consent to or control the release and utilization of their location data. Third-party applications continuously access and process location data without granular control by data owners and without their knowledge of how the data is used. The proliferation of GPS-enabled IoT devices will lead to larger-scale abuses of trust. In this paper we present the first design and implementation of a privacy module built into the GPSD daemon, a low-level GPS interface that runs on GPS-enabled devices. The integration of the privacy module ensures that data owners have granular control over the release of their GPS location. We describe the design of our privacy module and its integration into the GPSD daemon.
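As a rough illustration of the kind of granular release control such a privacy module could enforce, here is a minimal Python sketch that withholds or coarsens GPS fixes per application; the class and hook names are hypothetical illustrations, not GPSD's actual API.

def coarsen_fix(lat, lon, decimals):
    """Truncate a GPS fix to an owner-approved precision.
    Roughly: 2 decimals is ~1.1 km, 3 decimals is ~110 m at the equator."""
    return round(lat, decimals), round(lon, decimals)

class PrivacyPolicy:
    """Per-application release rules chosen by the data owner (hypothetical)."""
    def __init__(self):
        self.rules = {}  # app id -> allowed precision in decimals (absent = deny)

    def grant(self, app, decimals):
        self.rules[app] = decimals

    def release(self, app, lat, lon):
        decimals = self.rules.get(app)
        if decimals is None:
            return None  # the owner never consented: withhold the fix entirely
        return coarsen_fix(lat, lon, decimals)

policy = PrivacyPolicy()
policy.grant("weather-widget", 2)  # city-level precision only
print(policy.release("weather-widget", 42.36012, -71.05889))  # (42.36, -71.06)
print(policy.release("ad-tracker", 42.36012, -71.05889))      # None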
This paper introduces quarantine, a new security primitive for an operating system to use in order to protect information and isolate malicious behavior. Quarantine's core feature is the ability to fork a protection domain on-the-fly to isolate a specific principal's execution of untrusted code without risk of a compromise spreading. Forking enables the OS to ensure service continuity by permitting even high-risk operations to proceed, albeit subject to greater scrutiny and constraints. Quarantine even partitions executing threads that share resources into isolated protection domains. We discuss the design and implementation of quarantine within the LockDown OS, a security-focused evolution of the Composite component-based microkernel OS. Initial performance results for quarantine show that about 98% of the overhead comes from the cost of copying memory to the new protection domain.
Augmented reality is poised to become a dominant computing paradigm over the next decade. With promises of three-dimensional graphics and interactive interfaces, augmented reality experiences will rival the very best science fiction novels. This breakthrough also brings unique challenges in how users can authenticate one another to share rich content between augmented reality headsets. Traditional authentication protocols fall short when there is no common central entity or when access to a central authentication server is not available or desirable. Looks Good To Me (LGTM) is an authentication protocol that leverages the unique hardware and context provided by augmented reality headsets to bring innate human trust mechanisms into the digital world, solving authentication in a usable and secure way. LGTM works over point-to-point wireless communication, so users can authenticate one another in a variety of circumstances, and is designed with usability at its core, requiring users to perform only two actions: one to initiate and one to confirm. Users intuitively authenticate one another using, seemingly, only each other's faces; under the hood, LGTM combines facial recognition and wireless localization to bootstrap trust from a wireless signal, to a location, to a face, for secure and usable authentication.
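For intuition, a minimal sketch of the accept/reject decision the abstract describes, combining a wireless-localization position with a facial-recognition match; the thresholds and function signature are illustrative assumptions, not LGTM's actual implementation.

import math

def authenticate_peer(wireless_pos, face_pos, face_sim,
                      pos_tol_m=0.5, sim_thresh=0.9):
    """Accept a peer only if the face the user confirmed appears where the
    wireless signal localizes the peer's headset (position agreement) and
    that face matches the peer's transmitted facial data (similarity)."""
    dx = wireless_pos[0] - face_pos[0]
    dy = wireless_pos[1] - face_pos[1]
    same_spot = math.hypot(dx, dy) <= pos_tol_m
    return same_spot and face_sim >= sim_thresh

print(authenticate_peer((1.0, 2.0), (1.1, 2.0), face_sim=0.95))  # True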
This work studies applications and generalizations of a simple estimation technique that provides exponential concentration under heavy-tailed distributions, assuming only bounded low-order moments. We show that the technique can be used for approximate minimization of smooth and strongly convex losses, and specifically for least squares linear regression. For instance, our d-dimensional estimator requires just O(d log(1/δ)) random samples to obtain a constant factor approximation to the optimal least squares loss with probability 1-δ, without requiring the covariates or noise to be bounded or subgaussian. We provide further applications to sparse linear regression and low-rank covariance matrix estimation with similar allowances on the noise and covariate distributions. The core technique is a generalization of the median-of-means estimator to arbitrary metric spaces.
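A minimal Python sketch of the core median-of-means technique: split the data into k = O(log(1/delta)) random groups, average each group, and return the median of the group averages. The constant 8 in the group count is one common choice, not necessarily the paper's.

import numpy as np

def median_of_means(samples, delta=0.05, seed=None):
    rng = np.random.default_rng(seed)
    samples = np.asarray(samples, dtype=float)
    k = max(1, int(np.ceil(8 * np.log(1.0 / delta))))  # number of groups
    groups = np.array_split(samples[rng.permutation(len(samples))], k)
    return float(np.median([g.mean() for g in groups]))

# Heavy-tailed example: Pareto(1.5) samples have true mean 3 but infinite variance.
x = np.random.default_rng(0).pareto(1.5, size=10_000) + 1
print(median_of_means(x, delta=0.01))  # close to 3 with high probability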
As it becomes increasingly common for transaction processing systems to operate on datasets that fit within the main memory of a single machine or a cluster of commodity machines, traditional mechanisms for guaranteeing transaction durability, which typically involve synchronous log flushes, incur increasingly unappealing costs to otherwise lightweight transactions. Many applications have turned to periodically checkpointing full database state. However, existing checkpointing methods, even those which avoid freezing the storage layer, often come with significant costs to operation throughput, end-to-end latency, and total memory usage. This paper presents Checkpointing Asynchronously using Logical Consistency (CALC), a lightweight, asynchronous technique for capturing database snapshots that does not require a physical point of consistency to create a checkpoint, and avoids conspicuous latency spikes incurred by other database snapshotting schemes. Our experiments show that CALC can capture frequent checkpoints across a variety of transactional workloads with extremely small cost to transactional throughput and low additional memory usage compared to other state-of-the-art checkpointing systems.
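As a toy illustration of snapshotting without a physical point of consistency, in the spirit of CALC (a sketch of the general idea, not the paper's actual algorithm): once a checkpoint begins, the first write to each record preserves its pre-checkpoint value, so the checkpointer reads a consistent state while transactions keep running.

_ABSENT = object()  # marks records that did not exist when the checkpoint began

class Store:
    def __init__(self):
        self.live = {}            # current values seen by transactions
        self.stable = {}          # pre-checkpoint values, captured lazily
        self.checkpointing = False

    def write(self, key, value):
        if self.checkpointing and key not in self.stable:
            self.stable[key] = self.live.get(key, _ABSENT)
        self.live[key] = value

    def begin_checkpoint(self):
        self.stable = {}
        self.checkpointing = True

    def capture(self):
        # Snapshot: pre-checkpoint value if a record changed, else live value.
        snap = {k: self.stable.get(k, v) for k, v in self.live.items()}
        self.checkpointing = False
        return {k: v for k, v in snap.items() if v is not _ABSENT}

db = Store()
db.write("x", 1)
db.begin_checkpoint()
db.write("x", 2)        # concurrent update after the checkpoint began
print(db.capture())     # {'x': 1}: the snapshot is consistent
print(db.live)          # {'x': 2}: transactions never stalled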
Wearable personal health monitoring systems can offer a cost-effective solution for human healthcare. These systems must provide highly accurate, secure, and quick processing and delivery of vast amounts of data. In addition, wearable biomedical devices are used in inpatient, outpatient, and at-home e-Patient care, where they must constantly monitor the patient's biomedical and physiological signals 24/7. These biomedical applications require sampling and processing multiple streams of physiological signals within a strict power and area footprint. The processing typically consists of feature extraction, data fusion, and classification stages that require a large number of digital signal processing and machine learning kernels. In response to these requirements, this paper proposes a low-power, domain-specific manycore accelerator named Power Efficient Nano Clusters (PENC) to map and execute the kernels of these applications. Experimental results show that the manycore is able to reduce energy consumption by up to 80% and 14% for DSP and machine learning kernels, respectively, when optimally parallelized. The performance of the proposed PENC manycore when acting as a coprocessor to an Intel Atom processor is compared with existing commercial off-the-shelf embedded processing platforms, including the Intel Atom, Xilinx Artix-7 FPGA, and NVIDIA TK1 ARM-A15 with GPU SoC. The results show that the PENC manycore architecture reduces energy consumption by as much as 10X while outperforming all off-the-shelf embedded processing platforms across all studied machine learning classifiers.
The popularity of cloud hosting services also brings new security challenges: it has been reported that these services are increasingly utilized by miscreants for their malicious online activities. Mitigating this emerging threat, posed by such "bad repositories" (simply Bar), is challenging due to the differences between this hosting strategy and traditional hosting services, the lack of direct observation of the repositories by those outside the cloud, the reluctance of cloud providers to scan their customers' repositories without consent, and the unique evasion strategies employed by the adversary. In this paper, we took the first step toward understanding and detecting this emerging threat. Using a small set of "seeds" (i.e., confirmed Bars), we identified a set of collective features from the websites they serve (e.g., attempts to hide Bars) which uniquely characterize the Bars. These features were utilized to build a scanner that detected over 600 Bars on leading cloud platforms such as Amazon and Google, and 150K sites using them, including popular ones like groupon.com. Highlights of our study include the pivotal roles played by these repositories in malicious infrastructures, along with other important and previously unreported discoveries, such as how the adversary exploits legitimate cloud repositories and why the adversary uses Bars in the first place. These findings bring such malicious services into the spotlight and contribute to a better understanding of, and ultimately to eliminating, this new threat.
In the last decade, the number of available cores has increased and heterogeneity has grown. In this work, we ask whether the design of current operating systems (OSes) is still appropriate if these trends continue and lead to abundantly available but heterogeneous cores, or whether they force a fundamental rethinking of how systems are designed. We argue that: (1) hiding heterogeneity behind a common hardware interface unifies, to a large extent, the control and coordination of cores and accelerators in the OS; (2) isolating at the network-on-chip, rather than with processor features (like privileged mode, memory management unit, etc.), allows running untrusted code on arbitrary cores; and (3) providing OS services via protocols over the network-on-chip, instead of via system calls, makes them accessible to arbitrary types of cores as well. In summary, this turns accelerators into first-class citizens and enables a single and convenient programming environment for all cores without the need to trust any application. In this paper, we introduce network-on-chip-level isolation, present the design of our microkernel-based OS, M3, and the common hardware interface, and evaluate the performance of our prototype in comparison to Linux. Somewhat surprisingly, even without using accelerators, M3 outperforms Linux in some application-level benchmarks by more than a factor of five.
Sentiment analysis methods have become more and more popular, especially with the proliferation of social media platforms and their users. In this context, this paper presents a sentiment analysis approach that can faithfully capture the sentiment orientation of Arabic Twitter posts, based on a novel data representation and machine learning techniques. The proposed approach applies a wide range of features: lexical, surface-form, syntactic, etc. We also make use of lexicon features inferred from two Arabic sentiment word lexicons. To build our supervised sentiment analysis system, we use several standard classification methods (Support Vector Machines, K-Nearest Neighbour, Naïve Bayes, Decision Trees, Random Forest) known for their effectiveness on such classification tasks. In our study, the Support Vector Machines classifier outperforms the other supervised algorithms for Arabic Twitter sentiment analysis. Via ablation experiments, we show the positive impact of lexicon-based features on prediction performance.
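For illustration, a minimal sketch of a supervised pipeline of the kind the paper compares, TF-IDF features feeding a linear SVM; the toy English data, feature choices, and labels are placeholders, not the paper's Arabic Twitter corpus or lexicon features.

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

tweets = ["great service", "terrible experience", "love it", "worst app ever"]
labels = [1, 0, 1, 0]  # 1 = positive, 0 = negative (toy data)

model = make_pipeline(
    # Character n-grams are a common choice for morphologically rich languages.
    TfidfVectorizer(analyzer="char_wb", ngram_range=(2, 4)),
    LinearSVC(),
)
model.fit(tweets, labels)
print(model.predict(["love the service"]))  # expected: [1]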
We introduce a machine learning approach for distinguishing between integrated circuits fabricated in a ratified facility and circuits originating from an unknown or undesired source based on parametric measurements. Unlike earlier approaches, which seek to achieve the same objective in a general, design-independent manner, the proposed method leverages the interaction between the idiosyncrasies of the fabrication facility and a specific design, in order to create a customized fab-of-origin membership test for the circuit in question. Effectiveness of the proposed method is demonstrated using two large industrial datasets from a 65nm Texas Instruments RF transceiver manufactured in two different fabrication facilities.
With the growth of the Internet, the world has transformed into a global market, with all monetary and business activities carried out online. Being the most important resource of this evolving scene, the Internet is also a vulnerable target and hence needs to be secured against users with malicious intent. Since the Internet has no central surveillance component, attackers occasionally find paths to bypass a system's security using varied and evolving hacking techniques; one such class of attacks is intrusion. An intrusion is the act of breaking into a system by compromising its security policies, and the technique of examining system data for possible intrusions is known as intrusion detection. For the last two decades, automatic intrusion detection has been an important research topic. Researchers have developed Intrusion Detection Systems (IDS) capable of detecting attacks in several available environments; the latest on the scene are machine learning approaches. Machine learning techniques are an evolving set of algorithms that learn with experience, improve their performance in situations they have already encountered, and enjoy a broad range of applications in speech recognition, pattern detection, outlier analysis, etc. Many machine learning techniques have been developed for different applications, and no universal technique works equally well on all datasets. In this work, we evaluate all the machine learning algorithms provided by Weka against the standard dataset for intrusion detection, i.e., KDDCup99. The metrics considered are false positive rate, precision, ROC area, and true positive rate.
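A short sketch of computing the four reported metrics from a classifier's output; the toy labels stand in for KDDCup99 ground truth and the scores for any Weka classifier's predictions.

from sklearn.metrics import confusion_matrix, precision_score, roc_auc_score

y_true = [0, 0, 1, 1, 1, 0, 1, 0]                   # 1 = intrusion, 0 = normal
y_pred = [0, 1, 1, 1, 0, 0, 1, 0]                   # hard predictions
y_score = [0.1, 0.6, 0.9, 0.8, 0.4, 0.2, 0.7, 0.3]  # classifier confidences

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
print("TPR      :", tp / (tp + fn))                 # true positive rate
print("FPR      :", fp / (fp + tn))                 # false positive rate
print("Precision:", precision_score(y_true, y_pred))
print("ROC AUC  :", roc_auc_score(y_true, y_score))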
This year, software development teams around the world are consuming billions of open source and third-party components. The good news: they are accelerating time to market. The bad news: 1 in 17 of the components they are using includes known security vulnerabilities. In this talk, I will describe what Sonatype, the company behind The Central Repository that supports Apache Maven, has learned from analyzing how thousands of applications use open source components. I will also discuss how organizations like Mayo Clinic, Exxon, Capital One, the U.S. FDA, and Intuit are utilizing the principles of software supply chain automation to improve application security, and how organizations can balance the need for speed with quality and security early in the development cycle.
Traditional vibration inspection systems, equipped with separate sensing and communication modules, are either very expensive (e.g., hundreds of dollars) or suffer from occlusion and a narrow field of view (e.g., laser-based systems). In this work, we present an RFID-based solution, Tagbeat, to inspect mechanical vibration using COTS RFID tags and readers. Making sense of micro-scale and high-frequency vibration using random and low-frequency readings of a tag is a daunting task, especially when sub-millisecond period accuracy is required. Our system achieves these three goals (sensing micro vibration, handling high frequencies, and reaching sub-millisecond accuracy) by discerning the change pattern of the backscatter signal replied by the tag, which is attached to the vibrating surface and displaced by the vibration within a small range. This work introduces three main innovations. First, it shows how one can utilize COTS RFID to sense mechanical vibration and accurately discover its period from a few periods of short and noisy samples. Second, a new digital microscope is designed to amplify the weak signals induced by micro-vibrations. Third, Tagbeat introduces compressive reading to inspect high-frequency vibration with a relatively low RFID read rate. We implement Tagbeat using a COTS RFID device and evaluate it with a commercial centrifugal machine. Empirical benchmarks with a prototype show that Tagbeat can inspect the vibration period with a mean accuracy of 0.36ms and a relative error rate of 0.03%. We also study three cases to demonstrate how to associate our inspection solution with specific domain requirements.
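As a rough sketch of the period-recovery problem Tagbeat addresses, the following autocorrelation-based estimator pulls a vibration period out of noisy samples; it illustrates the task only and is not Tagbeat's compressive-reading algorithm.

import numpy as np

def estimate_period(signal, sample_rate_hz, min_lag=1):
    """Return the dominant period (seconds) as the lag of the highest
    autocorrelation peak after lag zero."""
    x = np.asarray(signal, dtype=float)
    x = x - x.mean()
    ac = np.correlate(x, x, mode="full")[len(x) - 1:]  # non-negative lags only
    lag = min_lag + int(np.argmax(ac[min_lag:]))
    return lag / sample_rate_hz

rate = 200.0  # simulated reads per second
t = np.arange(0, 2, 1 / rate)
noisy = np.sin(2 * np.pi * 25 * t) + 0.3 * np.random.default_rng(1).normal(size=t.size)
print(estimate_period(noisy, rate))  # ~0.04 s for a 25 Hz vibration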
Cryptocurrencies record transactions in a decentralized data structure called a blockchain. Two of the most popular cryptocurrencies, Bitcoin and Ethereum, support the feature to encode rules or scripts for processing transactions. This feature has evolved to give practical shape to the idea of smart contracts, or full-fledged programs that are run on blockchains. Recently, Ethereum's smart contract system has seen steady adoption, supporting tens of thousands of contracts holding millions of dollars' worth of virtual coins. In this paper, we investigate the security of running smart contracts based on Ethereum in an open distributed network like those of cryptocurrencies. We introduce several new security problems in which an adversary can manipulate smart contract execution to gain profit. These bugs suggest subtle gaps in the understanding of the distributed semantics of the underlying platform. As a refinement, we propose ways to enhance the operational semantics of Ethereum to make contracts less vulnerable. For developers writing contracts for the existing Ethereum system, we build a symbolic execution tool called Oyente to find potential security bugs. Among 19,336 existing Ethereum contracts, Oyente flags 8,833 of them as vulnerable, including the TheDAO bug which led to a 60 million US dollar loss in June 2016. We also discuss the severity of other attacks for several case studies which have source code available, and confirm the attacks (which target only our own accounts) in the main Ethereum network.
We present a novel type of Trojan trigger targeted at the field-programmable gate array (FPGA) design flow. Traditional triggers are based on rare events, such as rare values or sequences. While in most cases these trigger circuits are able to hide a Trojan attack, exhaustive functional simulation and testing will reveal the Trojan due to violation of the specification. Our trigger behaves functionally and formally equivalent to the hardware description language (HDL) specification throughout the entire FPGA design flow, until the design is written by the place-and-route tool as a bitstream configuration file. From then on, the Trojan payload is always on. We implement the trigger signal using a 4-input lookup table (LUT), with each of the inputs connected to the same signal. This lets us directly address the least significant bit (LSB) and most significant bit (MSB) of the LUT. With the remaining 14 bits, we realize a "magic" unary operation. This way, we are able to implement 16 different triggers. We demonstrate the attack with a simple example and discuss the effectiveness of the recent detection techniques unused circuit identification (UCI), functional analysis for nearly-unused circuit identification (FANCI), and VeriTrust in revealing our trigger.
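A toy software model of the trigger mechanism: with all four LUT inputs tied to one signal s, the only truth-table addresses ever exercised in hardware are 0b0000 (the LSB) and 0b1111 (the MSB), so setting LSB=0 and MSB=1 makes the LUT behave as a plain wire while the 14 middle configuration bits are free to carry hidden content; the payload encoding below is illustrative.

def lut4(config_bits, a, b, c, d):
    """Evaluate a 4-input LUT whose 16-bit truth table is config_bits."""
    address = (d << 3) | (c << 2) | (b << 1) | a
    return (config_bits >> address) & 1

hidden = 0b10110100101101           # 14 free bits of arbitrary hidden content
config = (1 << 15) | (hidden << 1)  # MSB = 1, LSB = 0

for s in (0, 1):
    # With every input tied to s, the LUT is functionally the identity on s,
    # so simulation and formal checks against the HDL specification pass.
    assert lut4(config, s, s, s, s) == s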
In this paper, we compare the effectiveness of Hidden Markov Models (HMMs) with that of Profile Hidden Markov Models (PHMMs), where both are trained on sequences of API calls. We compare our results to static analysis using HMMs trained on sequences of opcodes, and show that dynamic analysis achieves significantly stronger results in many cases. Furthermore, in comparing our two dynamic analysis approaches, we find that the PHMM-based technique consistently outperforms the HMM-based one.
Control theory and SDN (Software Defined Networking) are key components for NFV (Network Function Virtualization) deployment. However, little has been done to use a control-theoretic approach for SDN and NFV management. In this demo, we describe a use case for NFV management using control theory and SDN. We use the management architecture of RINA (a clean-slate Recursive InterNetwork Architecture) to manage Virtual Network Function (VNF) instances over the GENI testbed. We deploy Snort, an Intrusion Detection System (IDS), as the VNF. Our network topology has source and destination hosts, multiple IDSes, an Open vSwitch (OVS), and an OpenFlow controller. A distributed management application running on RINA measures the state of the VNF instances and communicates this information to a Proportional Integral (PI) controller, which then provides load balancing information to the OpenFlow controller. The latter in turn updates traffic flow forwarding rules on the OVS switch, thus balancing load across the VNF instances. This demo shows the benefits of using such a control-theoretic load balancing approach and the RINA management architecture in virtualized environments for NFV management. It also illustrates that the GENI testbed can easily support a wide range of SDN and NFV related experiments.
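A self-contained sketch of the described control loop: a PI controller drives the measured load difference between two IDS instances toward zero by adjusting the fraction of flows steered to each; the gains, capacities, and actuation model are illustrative, not the demo's tuned values.

class PIController:
    def __init__(self, kp, ki, setpoint=0.0):
        self.kp, self.ki = kp, ki
        self.setpoint = setpoint
        self.integral = 0.0

    def update(self, measurement, dt):
        error = self.setpoint - measurement
        self.integral += error * dt
        return self.kp * error + self.ki * self.integral

pi = PIController(kp=0.5, ki=0.1)  # drive the load difference to zero
split = 0.5                        # fraction of flows sent to IDS 1
cap1, cap2 = 1.0, 2.0              # IDS 2 is twice as fast (toy capacities)
for _ in range(20):
    load1, load2 = split / cap1, (1 - split) / cap2  # simulated utilizations
    split += pi.update(load1 - load2, dt=1.0)        # adjust forwarding split
    split = min(max(split, 0.0), 1.0)
print(round(split, 2))  # ~0.33: the split settles where both loads are equal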
We consider the task of secure multi-party computation of arithmetic circuits over a finite field. Unlike Boolean circuits, arithmetic circuits allow natural computations on integers to be expressed easily and efficiently. In the strongest setting of malicious security with a dishonest majority, where any number of parties may deviate arbitrarily from the protocol, most existing protocols require expensive public-key cryptography for each multiplication in the preprocessing stage of the protocol, which leads to a high total cost. We present a new protocol that overcomes this limitation by using oblivious transfer to perform secure multiplications in general finite fields with reduced communication and computation. Our protocol is based on an arithmetic view of oblivious transfer, with careful consistency checks and other techniques to obtain malicious security at a cost of less than 6 times that of semi-honest security. We describe a highly optimized implementation together with experimental results for up to five parties. By making extensive use of parallelism and SSE instructions, we improve upon previous runtimes for MPC over arithmetic circuits by more than 200 times.
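For intuition, a minimal sketch of the arithmetic view of oblivious transfer that underlies such protocols (a Gilboa-style construction: one OT per bit of a yields additive shares of a*b). The ideal_ot function is an idealized stand-in for a real OT protocol, and the consistency checks that provide malicious security are omitted.

import secrets

P = 2**61 - 1  # toy prime field

def ideal_ot(m0, m1, choice):
    """Idealized 1-out-of-2 OT: the receiver learns m[choice] and nothing else."""
    return m1 if choice else m0

def ot_multiply(a, b, nbits=61):
    """Additive shares (s_send, s_recv) with s_send + s_recv = a*b (mod P).
    The receiver holds a and selects by its bits; the sender holds b."""
    s_send, s_recv = 0, 0
    for i in range(nbits):
        r = secrets.randbelow(P)            # sender's fresh mask
        m0, m1 = r, (r + (1 << i) * b) % P  # messages encode 2^i * b
        s_recv = (s_recv + ideal_ot(m0, m1, (a >> i) & 1)) % P
        s_send = (s_send - r) % P
    return s_send, s_recv

a, b = 123456789, 987654321
s0, s1 = ot_multiply(a, b)
assert (s0 + s1) % P == (a * b) % P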
We present new applications for cryptographic secret handshakes between mobile devices on top of Bluetooth Low-Energy (LE). Secret handshakes enable mutual authentication, with the property that the parties learn nothing about each other unless they have been both issued credentials by a group administrator. This property provides strong privacy guarantees that enable interesting applications. One of them is proximity-based discovery for private communities. We introduce MASHaBLE, a mobile application that enables participants to discover and interact with nearby users if and only if they belong to the same secret community. We use direct peer-to-peer communication over Bluetooth LE, rather than relying on a central server. We discuss the specifics of implementing secret handshakes over Bluetooth LE and present our prototype implementation.
Masking requires splitting sensitive variables into at least d+1 shares to provide security against DPA attacks at order d. To date, this minimal number has only been deployed in software implementations of cryptographic algorithms and in the linear parts of their hardware counterparts. So far there is no hardware construction that achieves this lower bound if the function is nonlinear and the underlying logic gates can glitch. In this paper, we give practical implementations of the AES using d+1 shares, aiming at first- and second-order security even in the presence of glitches. To achieve this, we follow the conditions presented by Reparaz et al. at CRYPTO 2015 that allow hardware masking schemes, like Threshold Implementations, to provide theoretical higher-order security with d+1 shares. The decrease in the number of shares has a direct impact on area requirements: our second-order DPA-resistant core is the smallest in area so far, and its S-box is 50% smaller than the current smallest Threshold Implementation of the AES S-box with a similar security and attacker model. We assess the security of our masked cores by practical side-channel evaluations. The security guarantees are met with 100 million traces.
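A minimal sketch of the sharing step itself: a sensitive byte is split into d+1 Boolean shares, d of them uniformly random and the last chosen so all shares XOR back to the secret, which makes any d shares jointly uniform. The hard problem the paper solves, computing the nonlinear AES S-box on such shares without glitches leaking information, is not captured here.

import secrets

def share(secret, d, bits=8):
    """Split secret into d+1 Boolean shares."""
    shares = [secrets.randbits(bits) for _ in range(d)]
    last = secret
    for s in shares:
        last ^= s
    shares.append(last)
    return shares

def unshare(shares):
    out = 0
    for s in shares:
        out ^= s
    return out

sbox_input = 0xA5
shares = share(sbox_input, d=2)  # 3 shares, aiming at second-order security
assert unshare(shares) == sbox_input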
I will explain the linkage between threshold implementation masking schemes and multi-party computation. The basic principles that need to be taken from multi-party computation will be presented, as well as some basic protocols. The different natures of the resources and threat models between the two different applications of secret sharing will also be covered.
In an attempt to coerce useful information about the behavior of new malware families, threat analysts commonly force newly collected malicious software samples to run within a sandboxed environment. The main goal is to gather intelligence that can later be leveraged to detect and enumerate new malware infections within a network. Currently, most analysis environments "blindly" execute each newly collected malware sample for a predetermined amount of time (e.g., four to five minutes). However, a large majority of malware samples that are forced through sandbox execution are simply repackaged versions of previously seen (and already analyzed) malware. Consequently, a significant amount of time may be wasted in analyzing samples that do not generate new intelligence. In this paper, we propose MAXS, a novel probabilistic multi-hypothesis testing framework for scaling execution in malware analysis environments, including bare-metal execution environments. Our main goal is to automatically recognize whether a malware sample that is undergoing dynamic analysis has likely been seen before (e.g., in a "differently packed" form), and determine if we could therefore stop its execution early while avoiding loss of valuable malware intelligence (e.g., without missing DNS queries to never-before-seen malware command-and-control domains). We have tested our prototype implementation of MAXS over two large collections of malware execution traces obtained from two distinct production-level analysis environments. Our experimental results show that using MAXS we are able to reduce malware execution time by up to 50% on average, with less than 0.3% information loss. This roughly translates into the ability to double the capacity of malware sandbox environments, thus significantly optimizing the resources dedicated to malware execution and analysis. Our results are particularly important for bare-metal execution environments, in which it is not easy to leverage the economies of scale that characterize virtual-machine or emulation based malware sandboxes. For example, MAXS could be used to significantly cut the cost of bare-metal analysis environments by reducing the hardware resources needed to analyze a predetermined daily number of new malware samples.
The semantics of online authentication in the web are rather straightforward: if Alice has a certificate binding Bob's name to a public key, and if a remote entity can prove knowledge of Bob's private key, then (barring key compromise) that remote entity must be Bob. However, in reality, many websites, including the majority of the most popular ones, are hosted at least in part by third parties such as Content Delivery Networks (CDNs) or web hosting providers. Put simply: administrators of websites who deal with (extremely) sensitive user data are giving their private keys to third parties. Importantly, this sharing of keys is undetectable by most users, and widely unknown even among researchers. In this paper, we perform a large-scale measurement study of key sharing in today's web. We analyze the prevalence with which websites trust third-party hosting providers with their secret keys, as well as the impact that this trust has on responsible key management practices, such as revocation. Our results reveal that key sharing is extremely common, with a small handful of hosting providers having keys from the majority of the most popular websites. We also find that hosting providers often manage their customers' keys, and that they tend to react more slowly yet more thoroughly to compromised or potentially compromised keys.
IP tracking and cloaking are practices for identifying users that are used legitimately by websites to provide services and content tailored to particular users. However, it is believed that these practices are also used by malicious websites to avoid detection by anti-virus companies crawling the web to find malware. In addition, malicious websites are also believed to use IP tracking in order to deliver targeted malware based upon a history of previous visits by users. In this paper, we empirically investigate these beliefs and collect a large dataset of suspicious URLs in order to identify at what level IP tracking takes place: at the level of an individual address, or at the level of a network provider or organisation (network tracking). Our results illustrate that IP tracking is used in a small subset of domains within our dataset, while no strong indication of network tracking was observed.