Biblio
Differential privacy is a precise mathematical constraint meant to ensure privacy of individual pieces of information in a database even while queries are being answered about the aggregate. To use it soundly, one must understand what differential privacy does and does not guarantee. For example, the definition prevents a strong adversary who knows all but one entry in the database from learning much more about the remaining entry. This strong-adversary assumption is easily overlooked, leading to misinterpretation of the privacy guarantee that differential privacy provides. Herein we give an equivalent definition of privacy using mutual information that makes plain some of the subtleties of differential privacy. Mutual-information differential privacy is in fact sandwiched between ε-differential privacy and (ε,δ)-differential privacy in terms of its strength. In contrast to previous works that use unconditional mutual information, we show that differential privacy is fundamentally related to conditional mutual information, accompanied by a maximization over the database distribution. The conceptual advantage of using mutual information, aside from yielding a simpler and more intuitive definition of differential privacy, is that its properties are well understood. Several properties of differential privacy, such as composition theorems, are easily verified for the mutual-information alternative.
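Concretely, the mutual-information form of the definition described above can be sketched as a single conditional mutual information constraint with a maximization over the database distribution; the exact normalization (nats versus bits) and quantifier placement are as in the paper, so the display below is a sketch of the idea rather than the verbatim definition:

\[
\sup_{P_{X^n}} \; \max_{i} \; I\big(X_i ; Y \,\big|\, X_{-i}\big) \;\le\; \varepsilon ,
\]

where \(X^n = (X_1,\dots,X_n)\) is the database, \(X_{-i}\) denotes all entries except the \(i\)-th, and \(Y\) is the output of the mechanism.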
Differential privacy has become the dominant standard in the research community for strong privacy protection, and there has been a flood of research into query answering algorithms that meet this standard. These algorithms are becoming increasingly complex; in particular, the performance of many emerging algorithms is data dependent, meaning the distribution of the noise added to query answers may change with the input data. Theoretical analysis typically considers only the worst case, making empirical study of average-case performance increasingly important. In this paper we propose a set of evaluation principles which we argue are essential for sound evaluation. Based on these principles we propose DPBench, a novel framework for standardized evaluation of privacy algorithms. We then apply our benchmark to evaluate algorithms for answering 1- and 2-dimensional range queries. The result is a thorough empirical study of 15 published algorithms on a total of 27 datasets that offers new insights into algorithm behavior, in particular the influence of dataset scale and shape, and a more complete characterization of the state of the art. Our methodology is able to resolve inconsistencies in prior empirical studies and place algorithm performance in context through comparison to simple baselines. Finally, we pose open research questions which we hope will guide future algorithm design.
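To make "simple baselines" concrete, the sketch below shows the kind of identity/Laplace baseline such a benchmark might compare against: add Laplace noise calibrated to sensitivity 1 per cell to a histogram and answer range queries from the noisy counts. The function names and parameters are illustrative and not taken from DPBench.

```python
import numpy as np

def laplace_histogram(counts, epsilon, rng=None):
    """Add Laplace(1/epsilon) noise to each histogram cell (per-cell sensitivity 1)."""
    rng = rng or np.random.default_rng()
    return counts + rng.laplace(scale=1.0 / epsilon, size=counts.shape)

def range_query(noisy_counts, lo, hi):
    """Answer a 1-D range query [lo, hi) from the noisy histogram."""
    return float(noisy_counts[lo:hi].sum())

# Illustrative use: a 1-D histogram of 1024 cells, epsilon = 0.1.
hist = np.random.poisson(5.0, size=1024)   # stand-in dataset
noisy = laplace_histogram(hist, epsilon=0.1)
print(range_query(noisy, 100, 200), hist[100:200].sum())
```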
Recent advances in differential privacy make it possible to guarantee user privacy while preserving the main characteristics of the data. However, most differential privacy mechanisms assume that the underlying dataset is clean. This paper explores the link between data cleaning and differential privacy in a framework we call PrivateClean. PrivateClean includes a technique for creating private datasets of numerical and discrete-valued attributes, a formalism for privacy-preserving data cleaning, and techniques for answering sum, count, and avg queries after cleaning. We show: (1) how the degree of privacy affects subsequent aggregate query accuracy, (2) how privacy potentially amplifies certain types of errors in a dataset, and (3) how this analysis can be used to tune the degree of privacy. The key insight is to maintain a bipartite graph relating dirty values to clean values and use this graph to estimate biases due to the interaction between cleaning and privacy. We validate these results on four datasets with a variety of well-studied cleaning techniques including using functional dependencies, outlier filtering, and resolving inconsistent attributes.
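The abstract does not spell out the privatization mechanism itself, but a common choice for discrete-valued attributes is generalized randomized response, sketched below; treat it as an illustration of how a private dataset of categorical values could be created, not as PrivateClean's exact algorithm.

```python
import math
import random

def randomized_response(value, domain, epsilon, rng=random):
    """Generalized randomized response over a finite domain.

    Keeps the true value with probability e^eps / (e^eps + k - 1),
    otherwise reports one of the other k - 1 values uniformly at random.
    """
    k = len(domain)
    p_keep = math.exp(epsilon) / (math.exp(epsilon) + k - 1)
    if rng.random() < p_keep:
        return value
    others = [v for v in domain if v != value]
    return rng.choice(others)

cities = ["NY", "SF", "LA", "Boston"]
private = [randomized_response(c, cities, epsilon=1.0) for c in ["NY", "SF", "NY"]]
```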
We present sandbox mining, a technique to confine an application to resources accessed during automatic testing. Sandbox mining first explores software behavior by means of automatic test generation, and extracts the set of resources accessed during these tests. This set is then used as a sandbox, blocking access to resources not used during testing. The mined sandbox thus protects against behavior changes such as the activation of latent malware, infections, targeted attacks, or malicious updates. The use of test generation makes sandbox mining a fully automatic process that can be run by vendors and end users alike. Our BOXMATE prototype requires less than one hour to extract a sandbox from an Android app, with few to no confirmations required for frequently used functionality.
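The core loop, mining an allowlist of resources from automatically generated tests and then enforcing it at runtime, can be summarized in a few lines. The sketch below is a conceptual rendering of that loop with hypothetical function names; BOXMATE itself works on Android apps and system-level APIs rather than Python callables.

```python
def mine_sandbox(run_test, generated_tests, monitor):
    """Run generated tests and record every resource the app touches."""
    allowed = set()
    for test in generated_tests:
        with monitor() as accesses:      # hypothetical resource-access monitor
            run_test(test)
        allowed |= set(accesses)
    return allowed

def enforce(sandbox, resource, ask_user):
    """At runtime, block (or ask to confirm) any resource not seen during mining."""
    if resource in sandbox:
        return True
    return ask_user(f"App wants to access {resource!r} - allow?")
```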
Touch input promises intuitive interactions with digital content because it draws on our experience of manipulating physical objects: digital content can be rotated, scaled, and translated using direct manipulation gestures. However, this reliance on physical analogy also confines the scope of direct manipulation: the physical world provides no mechanism for interacting with abstract digital content. As a result, applications on touchscreen devices either offer only limited functionality or fall back on the traditional form-filling paradigm, which is tedious, slow, and error prone for touch input. My research focuses on designing a new UI framework that enables complex functionality on touchscreen devices by extending direct physical manipulation to abstract content via objectification. I present two research projects, objectification of attributes and of selection, which demonstrate considerable promise.
Image mining now plays a vital role in many areas of our lives, and numerous frameworks based on image mining have been proposed for object recognition, object tracking, image sensing, and medical image diagnosis. Nevertheless, research on image authentication based on image mining remains limited. This paper therefore presents a combination of frequent pattern mining and digital watermarking to support the authentication of images transmitted over public networks. The proposed framework exploits robust image features to extract the frequent patterns in the image data. The maximal relevant patterns are used to discriminate between textured and smooth blocks within the image, since textured blocks are more appropriate for embedding the secret data than smooth blocks. Experimental results demonstrate the efficiency of the proposed framework in terms of stability and robustness against different kinds of attacks, and show that it preserves image authentication.
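As a rough illustration of the block-selection step, the sketch below scores image blocks with a simple texture measure and keeps only the most textured ones for embedding; the actual framework derives this discrimination from maximal frequent patterns, so the variance-based score here is a stand-in, not the paper's method.

```python
import numpy as np

def textured_blocks(image, block=8, keep_ratio=0.3):
    """Return positions of the top-scoring blocks by variance (stand-in texture score)."""
    h, w = image.shape
    scores = []
    for y in range(0, h - block + 1, block):
        for x in range(0, w - block + 1, block):
            patch = image[y:y + block, x:x + block]
            scores.append((patch.var(), (y, x)))
    scores.sort(reverse=True)
    return [pos for _, pos in scores[: int(len(scores) * keep_ratio)]]
```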
Hardware support for isolated execution (such as Intel SGX) enables development of applications that keep their code and data confidential even while running in a hostile or compromised host. However, automatically verifying that such applications satisfy confidentiality remains challenging. We present a methodology for designing such applications in a way that enables certifying their confidentiality. Our methodology consists of forcing the application to communicate with the external world through a narrow interface, compiling it with runtime checks that aid verification, and linking it with a small runtime that implements the narrow interface. The runtime includes services such as secure communication channels and memory management. We formalize this restriction on the application as Information Release Confinement (IRC), and we show that it allows us to decompose the task of proving confidentiality into (a) one-time, human-assisted functional verification of the runtime to ensure that it does not leak secrets, (b) automatic verification of the application's machine code to ensure that it satisfies IRC and does not directly read or corrupt the runtime's internal state. We present /CONFIDENTIAL: a verifier for IRC that is modular, automatic, and keeps our compiler out of the trusted computing base. Our evaluation suggests that the methodology scales to real-world applications.
User engagement is recognized as an important component of the user experience, but relatively little is known about the effect of engagement on the learning outcomes of such interactions. This experimental user study examines the relationship between user engagement (UE) and comprehension in varied academic reading environments. Forty-one university students interacted with one of two sets of texts presented in four conditions in the context of preparing for a class assignment. Employing the User Engagement Scale (UES), we found evidence of a relationship between students' comprehension of the texts and their degree of engagement with them. However, this association was confined to one of the UES subscales and was not consistent across levels of engagement. An examination of additional variables found little evidence that system and content characteristics influenced engagement; however, we noted that all students reported increased knowledge, while topical interest declined for non-engaged students. Results contribute to the existing literature by adding further evidence that the relationship between engagement and comprehension is complex and mediated.
With the increasing capture of photos and their proliferation on social media, there is a pressing need for a more intuitive and versatile image search and exploration system. Image search systems have long been confined to the bounds of legacy 2D screens and the keyword text-box. With recent advances in Virtual Reality (VR) technology, a move towards an immersive VR environment will redefine the image navigation experience. To this end, we propose a VR platform that gathers images from various sources and addresses the 5 Ws of image search: what, where, when, who and why. We achieve this by providing the user with two modes of interactive exploration: (i) a mode that allows graph-based navigation of an image dataset, using a steering-wheel visualization, along multiple dimensions such as time, location, visual concept, and people; and (ii) a mode that provides intuitive exploration of the image dataset using a logical hierarchy of visual concepts. Our contributions include creating a VR image exploration experience that is intuitive and allows image navigation along multiple dimensions.
We show for the first time that commodity devices can be used to generate wireless data transmissions that are confined to the human body. Specifically, we show that commodity input devices such as fingerprint sensors and touchpads can be used to transmit information only to wireless receivers that are in contact with the body. We characterize the propagation of the resulting transmissions across the whole body and run experiments with ten subjects to demonstrate that our approach generalizes across different body types and postures. We also evaluate our communication system in the presence of interference from other wearable devices such as smartwatches and nearby metallic surfaces. Finally, by modulating the operations of these input devices, we demonstrate bit rates of up to 50 bits per second over the human body.
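The transmissions are produced by modulating the input device's normal operation on and off, i.e., a form of on-off keying. A minimal encoder/decoder sketch, with a made-up `set_sensor_active` hook standing in for the real fingerprint-sensor control, might look like this:

```python
import time

BIT_PERIOD = 0.02  # 50 bits per second, matching the rate reported in the abstract

def transmit(bits, set_sensor_active):
    """On-off keying: sensor active = 1, idle = 0, one bit per period."""
    for b in bits:
        set_sensor_active(bool(b))   # hypothetical device hook
        time.sleep(BIT_PERIOD)
    set_sensor_active(False)

def demodulate(samples, threshold):
    """Threshold received signal-strength samples (one per bit period) into bits."""
    return [1 if s > threshold else 0 for s in samples]
```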
System administrators have unlimited access to system resources. As the Snowden case shows, these permissions can be exploited to steal valuable personal, classified, or commercial data. In this work we propose a strategy that increases organizational information security by constraining IT personnel's view of the system and monitoring their actions. To this end, we introduce the abstraction of perforated containers: while regular Linux containers are too restrictive to be used by system administrators, by "punching holes" in them we strike a balance between information security and administrative needs. Our system predicts which system resources should be accessible for handling each IT issue, creates a perforated container with the corresponding isolation, and deploys it on the relevant machines as needed to fix the problem. Under this approach, the system administrator retains superuser privileges but can operate only within the container limits. We further provide means for the administrator to bypass the isolation and perform operations beyond these boundaries; however, such operations are monitored and logged for later analysis and anomaly detection. We provide a proof-of-concept implementation of our strategy, along with a case study on the IT database of IBM Research in Israel.
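One way to picture a "perforated" container is as a predicted allowlist of resources turned into container run options, for example bind mounts and capabilities. The mapping below is an illustrative sketch using standard docker-style flags, not the paper's implementation.

```python
def perforate(predicted_resources):
    """Turn resources predicted for an IT ticket into container 'holes'.

    predicted_resources: e.g. {"paths": ["/var/log/db"], "caps": ["NET_ADMIN"]}
    Returns docker-style run arguments (illustrative only).
    """
    args = ["--read-only", "--cap-drop=ALL"]
    for path in predicted_resources.get("paths", []):
        args.append(f"--volume={path}:{path}")   # punch a filesystem hole
    for cap in predicted_resources.get("caps", []):
        args.append(f"--cap-add={cap}")          # punch a capability hole
    return args

print(perforate({"paths": ["/var/log/db"], "caps": ["NET_ADMIN"]}))
```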
This paper introduces quarantine, a new security primitive for an operating system to use in order to protect information and isolate malicious behavior. Quarantine's core feature is the ability to fork a protection domain on-the-fly to isolate a specific principal's execution of untrusted code without risk of a compromise spreading. Forking enables the OS to ensure service continuity by permitting even high-risk operations to proceed, albeit subject to greater scrutiny and constraints. Quarantine even partitions executing threads that share resources into isolated protection domains. We discuss the design and implementation of quarantine within the LockDown OS, a security-focused evolution of the Composite component-based microkernel OS. Initial performance results for quarantine show that about 98% of the overhead comes from the cost of copying memory to the new protection domain.
In this article, we describe a neighbour disjoint multipath (NDM) scheme that is shown to be more resilient amidst node or link failures compared to the two well-known node disjoint and edge disjoint multipath techniques. A centralised NDM was first conceptualised in our initial published work utilising the spatial diversity among multiple paths to ensure robustness against localised poor channel quality or node failures. Here, we further introduce a distributed version of our NDM algorithm adapting to the low-power and lossy network (LLN) characteristics. We implement our distributed NDM algorithm in Contiki OS on top of LOADng—a lightweight On-demand Ad hoc Distance Vector Routing protocol. We compare this implementation's performance with a standard IPv6 Routing Protocol for Low power and Lossy Networks (RPL), and also with basic LOADng, running in the Cooja simulator. Standard performance metrics such as packet delivery ratio, end-to-end latency, overhead and average routing table size are identified for the comparison. The results and observations are provided considering a few different application traffic patterns, which serve to quantify the improvements in robustness arising from NDM. The results are confirmed by experiments using a public sensor network testbed with over 100 nodes.
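To make the distinction concrete, a pair of paths is neighbour disjoint when, beyond sharing no intermediate nodes, no intermediate node of one path is a radio neighbour of an intermediate node of the other. The small check below over a networkx graph illustrates the property itself, not the distributed algorithm described in the article.

```python
import networkx as nx

def neighbour_disjoint(G, path_a, path_b):
    """True if the intermediate nodes of the two paths are mutually non-adjacent."""
    mid_a, mid_b = set(path_a[1:-1]), set(path_b[1:-1])
    if mid_a & mid_b:                       # node disjointness is a prerequisite
        return False
    return all(not (mid_b & set(G.neighbors(u))) for u in mid_a)

G = nx.grid_2d_graph(5, 5)                  # toy radio topology
top_right = [(0, c) for c in range(5)] + [(r, 4) for r in range(1, 5)]
down_right = [(r, 0) for r in range(5)] + [(4, c) for c in range(1, 5)]
print(neighbour_disjoint(G, top_right, down_right))   # True on this grid
```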
While power grid systems benefit from utilizing communication network through networked control and protection, the addition of communication exposes the power system to new security vulnerabilities and potential attacks. To mitigate these attacks, such as denial of service, intrusion detection systems (IDS) are often employed. In this paper we investigate the relationship of IDS accuracy performance to the stability of power systems via its impact on communication latency. Several IDS machine learning algorithms are implemented on the NSL-KDD dataset to obtain accuracy performance, and a mathematical model for computing the latency when incorporating IDS detection information during network routing is introduced. Simulation results on the New England 39-bus power system suggest that during a cyber-physical attack, a practical IDS can achieve similar stability as an ideal IDS with perfect detection. In addition, false positive rate has been found to have a larger impact than false negative rate under the simulation conditions studied. These observations can contribute to the design requirements of future embedded IDS solutions for power systems.
In an electrical distribution network, the challenges involved in decentralized power generation and in the network's resilience to failures can easily be anticipated. With the use of information technology, better control can be achieved over the distributed generation units and the handling of faults within them. In this contribution, the use of a graceful degradation strategy is proposed as a means to improve the availability of the system during a fault situation. Graceful degradation is presented as a constraint satisfaction problem: the trigger and the computation of the degradation process are formulated as constraints. The concept of resource utility is used to support a dynamic decision on when to trigger the degradation process. The computation of the graceful degradation strategy is formalized as an SMT problem and analyzed using the Z3 SMT solver. The approach is illustrated with a use case in which the degradation strategy is applied to a prosumer node during a power outage in the distribution network, demonstrating the dynamic calculation capability of the degradation scheme in the face of unpredictable power from a renewable energy resource.
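The degradation decision can be expressed directly as an SMT/optimization query. The toy Z3 model below chooses which loads to shed so that the retained demand fits the currently available (for example renewable) supply while maximizing retained utility; the variable names and utility weighting are illustrative, not the paper's exact encoding.

```python
from z3 import Bool, If, Optimize, Sum, sat, is_true

loads = {"heating": (3, 5), "lighting": (1, 4), "ev_charger": (7, 2)}  # name: (kW, utility)
available_kw = 6                                   # current supply

opt = Optimize()
keep = {name: Bool(f"keep_{name}") for name in loads}

# Total power of retained loads must not exceed the available supply.
opt.add(Sum([If(keep[n], kw, 0) for n, (kw, _) in loads.items()]) <= available_kw)
# Degrade gracefully: maximize the utility of what is kept.
opt.maximize(Sum([If(keep[n], u, 0) for n, (_, u) in loads.items()]))

if opt.check() == sat:
    m = opt.model()
    print({n: is_true(m[keep[n]]) for n in loads})   # which loads stay on
```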
Wireless sensor-actuator networks (WSANs) are being adopted in process industries because of their advantages in lowering deployment and maintenance costs. While there has been significant theoretical advancement in networked control design, only limited empirical results that combine control design with realistic WSAN standards exist. This paper presents a cyber-physical case study on a wireless process control system that integrates state-of-the-art network control design and a WSAN based on the WirelessHART standard. The case study systematically explores the interactions between wireless routing and control design in the process control plant. The network supports alternative routing strategies, including single-path source routing and multi-path graph routing. To mitigate the effect of data loss in the WSAN, the control design integrates an observer based on an Extended Kalman Filter with a model predictive controller and an actuator buffer of recent control inputs. We observe that sensing and actuation can have different levels of resilience to packet loss under this network control design. We then propose a flexible routing approach where the routing strategy for sensing and actuation can be configured separately. Finally, we show that an asymmetric routing configuration with different routing strategies for sensing and actuation can effectively improve control performance under significant packet loss. Our results highlight the importance of jointly designing wireless network protocols and control in wireless control systems.
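The packet-loss mitigation on the actuation side can be pictured as the controller sending a short horizon of predicted inputs each period and the actuator falling back to the buffered sequence when an update is lost. The sketch below shows that buffering logic in isolation; it is a schematic of the idea, not the paper's WirelessHART implementation.

```python
class ActuatorBuffer:
    """Hold the most recent sequence of predicted control inputs from the MPC."""

    def __init__(self, default=0.0):
        self.pending = []          # remaining inputs from the last received packet
        self.default = default

    def update(self, predicted_inputs):
        """Called when a control packet arrives (e.g. an MPC horizon [u_k, u_k+1, ...])."""
        self.pending = list(predicted_inputs)

    def next_input(self):
        """Called every actuation period, whether or not a packet arrived."""
        if self.pending:
            return self.pending.pop(0)
        return self.default        # horizon exhausted: hold a safe default

buf = ActuatorBuffer()
buf.update([1.0, 0.8, 0.6])        # one packet received, subsequent packets lost
print([buf.next_input() for _ in range(5)])   # [1.0, 0.8, 0.6, 0.0, 0.0]
```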
We introduce a scalable observer architecture to estimate the states of a discrete-time linear-time-invariant (LTI) system whose sensors can be manipulated by an attacker. Given the maximum number of attacked sensors, we build on previous results on necessary and sufficient conditions for state estimation, and propose a novel multi-modal Luenberger (MML) observer based on efficient Satisfiability Modulo Theory (SMT) solving. We present two techniques to reduce the complexity of the estimation problem. As a first strategy, instead of a bank of distinct observers, we use a family of filters sharing a single dynamical equation for the states, but different output equations, to generate estimates corresponding to different subsets of sensors. Such an architecture can reduce the memory usage of the observer from an exponential to a linear function of the number of sensors. We then develop an efficient SMT-based decision procedure that is able to reason about the estimates of the MML observer to detect at runtime which sets of sensors are attack-free, and use them to obtain a correct state estimate. We provide proofs of convergence for our algorithm and report simulation results to compare its runtime performance with alternative techniques. Our algorithm scales well for large systems (including up to 5000 sensors) for which many previously proposed algorithms are not implementable due to excessive memory and time requirements. Finally, we illustrate the effectiveness of our algorithm on the design of resilient power distribution systems.
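The memory saving comes from sharing one state-prediction equation across all sensor subsets and specializing only the output-correction step. A compact numpy rendering of that idea is sketched below, with an illustrative fixed gain in place of the paper's designed Luenberger gains.

```python
import numpy as np

def mml_step(A, C, L, x_hats, u_effect, y, subsets):
    """One update of a multi-modal Luenberger-style observer.

    x_hats maps a sensor subset (tuple of sensor indices) to its current estimate.
    All estimates share the same dynamics A; each subset uses only its rows of C.
    L is an illustrative output-injection gain (n x p); the actual design chooses
    gains that guarantee convergence for every attack-free subset.
    """
    new = {}
    for s in subsets:
        pred = A @ x_hats[s] + u_effect               # shared dynamical equation
        Cs, ys, Ls = C[list(s), :], y[list(s)], L[:, list(s)]
        new[s] = pred + Ls @ (ys - Cs @ pred)         # subset-specific correction
    return new
```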
Extensible Cyber-Physical Systems (CPS) are loosely connected, multi-domain platforms that "virtualize" their resources to provide an open platform capable of hosting different cyber-physical applications. These cyber-physical platforms are extensible in that resources and applications can be added or removed at any time. However, realizing such a platform requires resolving challenges emanating from several different properties; in this paper we focus on resilience. Resilience is important for extensible CPS to ensure that the extensibility of a system does not result in failures and anomalies.
Energy management systems (EMS) are used to control energy usage in buildings and campuses, by employing technologies such as supervisory control and data acquisition (SCADA) and building management systems (BMS), in order to provide reliable energy supply and maximise user comfort while minimising energy usage. Historically, EMS systems were installed when potential security threats were only physical. Nowadays, EMS systems are connected to the building network and as a result directly to the outside world. This extends the attack surface to potential sophisticated cyber-attacks, which adversely impact EMS operation, resulting in service interruption and downstream financial implications. Currently, the security systems that detect attacks operate independently of those that deploy resiliency policies and use very basic methods. We propose a novel EMS cyber-physical-security framework that executes a resilient policy whenever an attack is detected using security analytics. In this framework, both the resilient policy and the security analytics are driven by EMS data, where the physical correlations between the data-points are identified to detect outliers and then the control loop is closed using an estimated value in place of the outlier. The framework has been tested using a reduced order model of a real EMS site.
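The closed-loop substitution described above can be illustrated with a minimal version: predict a sensor reading from physically correlated data points with a linear model, flag it as an outlier when the residual is too large, and feed the estimate back in its place. This is a sketch of the idea, not the framework's actual analytics.

```python
import numpy as np

def fit_correlation_model(history_x, history_y):
    """Least-squares fit predicting one data point from a physically correlated one."""
    X = np.column_stack([history_x, np.ones(len(history_x))])
    coef, *_ = np.linalg.lstsq(X, history_y, rcond=None)
    return coef

def sanitize(x_now, y_now, coef, threshold):
    """Return (value_for_control_loop, is_outlier)."""
    y_est = coef[0] * x_now + coef[1]
    if abs(y_now - y_est) > threshold:      # possible attack or fault
        return y_est, True                  # close the loop with the estimate
    return y_now, False
```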
Here we model the indirect costs of deploying security controls in small-to-medium enterprises (SMEs) to manage cyber threats. SMEs may not have the in-house skills and collective capacity to operate controls efficiently, resulting in inadvertent data leakage and exposure to compromise. Aside from financial costs, attempts to maintain security can impact morale, system performance, and retraining requirements, which are modelled here. Managing the overall complexity and effectiveness of an SME's security controls has the potential to reduce unintended leakage. The UK Cyber Essentials Scheme informs basic control definitions, and Available Responsibility Budget (ARB) is modelled to understand how controls can be prioritised for both security and usability. Human factors of security and practical experience of security management for SMEs inform the modelling of deployment challenges across a set of SME archetypes differing in size, complexity, and use of IT. Simple combinations of controls are matched to archetypes, balancing capabilities to protect data assets with the effort demands placed upon employees. Experiments indicate that two-factor authentication can be readily adopted by many SMEs and their employees to protect core assets, followed by correct access privileges and anti-malware software. Service and technology providers emerge as playing an important role in improving access to usable security controls for SMEs.
Modern Industrial Control Systems (ICS) rely on enterprise-to-plant-floor connectivity. As the size, diversity, and therefore complexity of ICS increase, so do the operational requirements, goals, and challenges defined by users across the various sub-systems. Recent trends in Information Technology (IT) and Operational Technology (OT) convergence may cause operators to lose a comprehensive understanding of end-to-end data flow requirements. This presents a risk to system security and resilience. Sensors were once solely applied for operational process use, but now act as inputs supporting a diverse set of organisational requirements. If these are not fully understood, incomplete risk assessment and inappropriate implementation of security controls could occur. In search of a solution, operators may turn to standards and guidelines. This paper reviews popular standards and guidelines, prior to the presentation of a case study and conceptual tool, highlighting the importance of data flows, critical data processing points, and system-to-user relationships. The proposed approach forms a basis for risk assessment and security control implementation, aiding the evolution of ICS security and resilience.
This article focuses on the design of safe and attack-resilient Cyber-Physical Systems (CPS) equipped with multiple sensors measuring the same physical variable. A malicious attacker may be able to disrupt system performance through compromising a subset of these sensors. Consequently, we develop a precise and resilient sensor fusion algorithm that combines the data received from all sensors by taking into account their specified precisions. In particular, we note that in the presence of a shared bus, in which messages are broadcast to all nodes in the network, the attacker’s impact depends on what sensors he has seen before sending the corrupted measurements. Therefore, we explore the effects of communication schedules on the performance of sensor fusion and provide theoretical and experimental results advocating for the use of the Ascending schedule, which orders sensor transmissions according to their precision starting from the most precise. In addition, to improve the accuracy of the sensor fusion algorithm, we consider the dynamics of the system in order to incorporate past measurements at the current time. Possible ways of mapping sensor measurement history are investigated in the article and are compared in terms of the confidence in the final output of the sensor fusion. We show that the precision of the algorithm using history is never worse than the no-history one, while the benefits may be significant. Furthermore, we utilize the complementary properties of the two methods and show that their combination results in a more precise and resilient algorithm. Finally, we validate our approach in simulation and experiments on a real unmanned ground robot.
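A common way to realize precision-aware fusion of this kind is interval based: each sensor reports an interval whose width reflects its precision, and the fused estimate is the smallest interval guaranteed to contain the true value as long as at most f sensors are corrupted, i.e., the smallest interval containing every point covered by at least n - f sensor intervals. The sketch below implements that textbook formulation only; the article's algorithm additionally folds in measurement history and transmission scheduling.

```python
def fuse_intervals(intervals, f):
    """Smallest interval containing all points covered by at least n - f sensor intervals.

    intervals: list of (lo, hi) pairs, one per sensor; at most f may be corrupted.
    Returns (lo, hi), or None if no point has enough support.
    """
    n = len(intervals)
    need = n - f
    endpoints = sorted({e for lo, hi in intervals for e in (lo, hi)})
    covered = [p for p in endpoints
               if sum(lo <= p <= hi for lo, hi in intervals) >= need]
    if not covered:
        return None
    return min(covered), max(covered)

print(fuse_intervals([(0.0, 2.0), (1.0, 3.0), (10.0, 11.0)], f=1))  # (1.0, 2.0)
```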
From pencils to commercial aircraft, every man-made object must be designed and manufactured. When it is cheaper or easier to steal a design or a manufacturing process specification than to invent one's own, the incentive for theft is present. As more and more manufacturing data comes online, incidents of such theft are increasing. In this paper, we present a side-channel attack on manufacturing equipment that reveals both the form of a product and its manufacturing process, i.e., exactly how it is made. In the attack, a human deliberately or accidentally places an attack-enabled phone close to the equipment or makes or receives a phone call on any phone nearby. The phone executing the attack records audio and, optionally, magnetometer data. We present a method of reconstructing the product's form and manufacturing process from the captured data, based on machine learning, signal processing, and human assistance. We demonstrate the attack on a 3D printer and a CNC mill, each with its own acoustic signature, and discuss the commonalities in the sensor data captured for these two different machines. We compare the quality of the data captured with a variety of smartphone models. Capturing data from the 3D printer, we reproduce the form and process information of objects previously unknown to the reconstructors. On average, our accuracy is within 1 mm in reconstructing the length of a line segment in a fabricated object's shape and within 1 degree in determining an angle in a fabricated object's shape. We conclude with recommendations for defending against these attacks.
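The reconstruction pipeline starts from time-frequency features of the recorded audio; a minimal feature-extraction step of that kind, feeding whatever classifier or regressor is used downstream, might look like the following. It illustrates only a generic signal-processing front end, not the paper's full method.

```python
import numpy as np
from scipy.signal import spectrogram

def audio_features(samples, fs, frame_s=0.05):
    """Log-magnitude spectrogram frames from a mono recording (e.g. phone audio)."""
    nperseg = int(fs * frame_s)
    f, t, S = spectrogram(samples, fs=fs, nperseg=nperseg)
    return np.log1p(S).T            # one feature vector per time frame

# e.g. features = audio_features(recording, fs=44100), then a learned model maps
# each frame to machine state such as axis speed/direction (learning step not shown).
```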
The following decade will witness a surge in remote health-monitoring systems that are based on body-worn monitoring devices. These Medical Cyber Physical Systems (MCPS) will be capable of transmitting the acquired data to a private or public cloud for storage and processing. Machine learning algorithms running in the cloud and processing this data can provide decision support to healthcare professionals. There is no doubt that the security and privacy of the medical data is one of the most important concerns in designing an MCPS. In this paper, we depict the general architecture of an MCPS consisting of four layers: data acquisition, data aggregation, cloud processing, and action. Due to the differences in hardware and communication capabilities of each layer, different encryption schemes must be used to guarantee data privacy within that layer. We survey conventional and emerging encryption schemes based on their ability to provide secure storage, data sharing, and secure computation. Our detailed experimental evaluation of each scheme shows that while the emerging encryption schemes enable exciting new features such as secure sharing and secure computation, they introduce several orders-of-magnitude computational and storage overhead. We conclude our paper by outlining future research directions to improve the usability of the emerging encryption schemes in an MCPS.
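As one concrete example of the secure-computation capability mentioned above, additively homomorphic encryption such as Paillier lets the cloud sum encrypted vitals without seeing them. The sketch below uses the open-source python-paillier (`phe`) package, which is an assumption of this illustration rather than a scheme prescribed by the paper.

```python
from phe import paillier   # python-paillier package (assumed available)

public_key, private_key = paillier.generate_paillier_keypair(n_length=2048)

heart_rates = [72, 75, 71, 80]                        # readings from the acquisition layer
encrypted = [public_key.encrypt(x) for x in heart_rates]

# Cloud side: aggregate without decrypting (additive homomorphism).
encrypted_sum = sum(encrypted[1:], encrypted[0])

# Trusted side: only the aggregate is revealed.
print(private_key.decrypt(encrypted_sum) / len(heart_rates))   # 74.5
```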
We examine the security of home smart locks: cyber-physical devices that replace traditional door locks with deadbolts that can be electronically controlled by mobile devices or the lock manufacturer's remote servers. We present two categories of attacks against smart locks and analyze the security of five commercially-available locks with respect to these attacks. Our security analysis reveals that flaws in the design, implementation, and interaction models of existing locks can be exploited by several classes of adversaries, allowing them to learn private information about users and gain unauthorized home access. To guide future development of smart locks and similar Internet of Things devices, we propose several defenses that mitigate the attacks we present. One of these defenses is a novel approach to securely and usably communicate a user's intended actions to smart locks, which we prototype and evaluate. Ultimately, our work takes a first step towards illuminating security challenges in the system design and novel functionality introduced by emerging IoT systems.