Biblio
Databases have become one of the most important components in modern software systems. For example, web services, cloud computing systems, and online transaction processing systems all rely heavily on databases. To abstract the complexity of accessing a database, developers make use of Object-Relational Mapping (ORM) frameworks. ORM frameworks provide an abstraction layer between the application logic and the underlying database. Such an abstraction layer automatically maps objects in object-oriented languages to database records, which significantly reduces the amount of boilerplate code that needs to be written. Despite the advantages of using ORM frameworks, we observe several difficulties in maintaining ORM code (i.e., code that makes use of ORM frameworks) while cooperating with our industrial partner. After conducting studies on other open source systems, we find that such difficulties are common in other Java systems. Our study finds that i) ORM cannot completely encapsulate database accesses in objects or abstract the underlying database technology, which may cause ORM code changes to be more scattered; ii) ORM code changes are more frequent than regular code changes, yet there is a lack of tools that help developers verify ORM code at compilation time; and iii) changes to ORM code are commonly due to performance or security reasons; however, traditional static code analyzers need to be extended to capture the peculiarities of ORM code in order to detect such problems. Our study highlights the hidden maintenance costs of using ORM frameworks and provides some initial insights about potential approaches to help maintain ORM code. Future studies should carefully examine ORM code, especially given the rising use of ORM in modern software systems.
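As a rough illustration of the kind of code the study examines (the paper targets Java systems; this sketch uses Python's SQLAlchemy purely for illustration), an ORM maps a class to a table and generates SQL behind the scenes:

```python
from sqlalchemy import Column, Integer, String, create_engine
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()

class User(Base):
    # The ORM maps this class to the "users" table; no SQL is written by hand.
    __tablename__ = "users"
    id = Column(Integer, primary_key=True)
    name = Column(String)

engine = create_engine("sqlite:///:memory:")
Base.metadata.create_all(engine)

with Session(engine) as session:
    session.add(User(name="alice"))
    session.commit()
    # The ORM translates this query into SQL at runtime, which is one reason
    # problems (e.g. inefficient queries) are hard to catch at compilation time.
    users = session.query(User).filter_by(name="alice").all()
```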
Mobile devices, such as smartphones, have become a common tool in our daily routine. Mobile applications (a.k.a. apps) increasingly demand access to contextual information. For instance, apps require data about the user's environment as well as their profile in order to adapt themselves (interfaces, services, content) to this context. Mobile apps with this behavior are known as context-aware applications (CAS). Several software infrastructures have been created to help the development of CAS. However, most of them do not store contextual data, since mobile devices are resource-constrained. Nor are they built with the privacy of contextual data in mind, even though apps may expose contextual data without user consent. This paper addresses these topics by extending an existing middleware platform that helps the development of mobile context-aware applications. Our extension aims to store and process the contextual data generated by several mobile devices using the computational power of the cloud, and to support the definition of privacy policies that prevent the dissemination of unauthorized contextual data.
We present RamCrypt, a solution that allows unmodified Linux processes to transparently work on encrypted data. RamCrypt can be deployed and enabled on a per-process basis without recompiling user-mode applications. In every enabled process, data is stored in cleartext only for the moment it is processed, and otherwise stays encrypted in RAM. In particular, the required encryption keys do not reside in RAM, but are stored in CPU registers only. Hence, RamCrypt effectively thwarts memory disclosure attacks, which grant unauthorized access to process memory, as well as physical attacks such as cold boot and DMA attacks. In its default configuration, RamCrypt exposes only up to 4 memory pages in cleartext at the same time. For the nginx web server serving encrypted HTTPS pages under heavy load, the necessary TLS secret key is hidden for 97% of the time.
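A toy model of the sliding-window policy described above (not RamCrypt's kernel implementation: in RamCrypt the key is held in CPU registers rather than in a variable, and the window is enforced at the page-fault level):

```python
import os
from collections import OrderedDict
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

WINDOW = 4                                   # at most 4 pages in cleartext at once
key = AESGCM.generate_key(bit_length=128)    # in RamCrypt this never touches RAM
aead = AESGCM(key)

encrypted = {}                               # page number -> (nonce, ciphertext)
cleartext = OrderedDict()                    # page number -> plaintext (LRU order)

for p in range(16):                          # populate a toy encrypted address space
    nonce = os.urandom(12)
    encrypted[p] = (nonce, aead.encrypt(nonce, bytes(4096), None))

def access_page(page_no):
    """Decrypt a page on access; re-encrypt the least recently used page if the
    cleartext window would exceed WINDOW pages."""
    if page_no in cleartext:
        cleartext.move_to_end(page_no)
        return cleartext[page_no]
    nonce, ct = encrypted.pop(page_no)
    cleartext[page_no] = aead.decrypt(nonce, ct, None)
    if len(cleartext) > WINDOW:
        old_no, old_data = cleartext.popitem(last=False)
        new_nonce = os.urandom(12)
        encrypted[old_no] = (new_nonce, aead.encrypt(new_nonce, old_data, None))
    return cleartext[page_no]
```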
Modern OS kernels including Windows, Linux, and Mac OS have all adopted kernel Address Space Layout Randomization (ASLR), which shifts the base address of kernel code and data into different locations in different runs. Consequently, when performing introspection or forensic analysis of kernel memory, we cannot use any pre-determined addresses to interpret the kernel events. Instead, we must derandomize the address space layout and use the new addresses. However, few efforts have been made to derandomize the kernel address space, and many questions remain, such as which approach is more efficient and robust. Therefore, we present the first systematic study of how to derandomize a kernel when given a memory snapshot of a running kernel instance. Unlike the derandomization approaches used in traditional memory exploits, in which only remote access is available, with introspection and forensics applications we can use all the information available in kernel memory to generate signatures and derandomize the ASLR. In other words, there exists a large space of solutions for this problem. As such, in this paper we examine a number of typical approaches to generate strong signatures from both kernel code and data, based on the insight of how kernel code and data are updated, and compare them from the perspectives of efficiency (in terms of simplicity, speed, etc.) and robustness (e.g., whether the approach is hard to evade or forge). In particular, we have designed four approaches, including brute-force code scanning, patched code signature generation, unpatched code signature generation, and a read-only pointer based approach, according to the intrinsic behavior of kernel code and data with respect to kernel ASLR. We have obtained encouraging results for each of these approaches, and the corresponding experimental results are reported in this paper.
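The simplest of the four approaches, brute-force code scanning, can be sketched as follows; the addresses and alignment are hypothetical, and a real implementation must also handle bytes the kernel patches at runtime, which is what motivates the patched/unpatched signature variants:

```python
DEFAULT_ADDR = 0xFFFFFFFF81000000   # hypothetical non-randomized address of a known function
ALIGN = 0x200000                    # assume kernel text is randomized at 2 MiB granularity

def find_kaslr_slide(snapshot: bytes, snapshot_base: int, signature: bytes) -> int:
    """Brute-force code scanning: locate a known, read-only (unpatched) code
    signature in the memory snapshot and derive the randomization slide."""
    pos = snapshot.find(signature)
    while pos != -1:
        slide = (snapshot_base + pos) - DEFAULT_ADDR
        if slide % ALIGN == 0:      # a valid slide must respect the alignment
            return slide
        pos = snapshot.find(signature, pos + 1)
    raise ValueError("signature not found; the code may be patched at runtime")
```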
MPI includes all processes in MPI\_COMM\_WORLD; this is untenable for reasons of scale, resiliency, and overhead. This paper offers a new approach, extending MPI with a new concept called Sessions, which makes two key contributions: a tighter integration with the underlying runtime system; and a scalable route to communication groups. This is a fundamental change in how we organise and address MPI processes that removes well-known scalability barriers by no longer requiring the global communicator MPI\_COMM\_WORLD.
All modern web browsers (Internet Explorer, Firefox, Chrome, Opera, and Safari) have a core rendering engine written in C++. This language choice was made because it affords the systems programmer complete control of the underlying hardware features and memory in use, and it provides a transparent compilation model. Unfortunately, this language is complex (especially to new contributors!), challenging to write correct parallel code in, and highly susceptible to memory safety issues that potentially lead to security holes. Servo is a project started at Mozilla Research to build a new web browser engine that preserves the capabilities of these other browser engines but also both takes advantage of recent trends in parallel hardware and is more memory-safe. We use a new language, Rust, that provides us a level of control over the underlying system similar to that of C++, but which statically prevents many memory safety issues and provides direct support for parallelism and concurrency. In this paper, we show how a language with an advanced type system can address many of the most common security issues and software engineering challenges in other browser engines, while still producing code that has the same performance and memory profile. This language is also quite accessible to new open source contributors and employees, even those without a background in C++ or systems programming. We also outline several pitfalls encountered along the way and describe some potential areas for future improvement.
Lattice-based cryptography has gained credence recently as a replacement for current public-key cryptosystems, due to its quantum resilience, versatility, and relatively low key sizes. To date, encryption based on the learning with errors (LWE) problem has only been investigated from an ideal lattice standpoint, due to its computation and size efficiencies. However, a thorough investigation of standard lattices in practice has yet to be considered. Standard lattices may be preferred to ideal lattices due to their stronger security assumptions and less restrictive parameter selection process. In this paper, an area-optimised hardware architecture of a standard lattice-based cryptographic scheme is proposed. The design is implemented on an FPGA, and it is found that both encryption and decryption fit comfortably on a Spartan-6 FPGA. This is the first hardware architecture for standard lattice-based cryptography reported in the literature to date, and thus serves as a benchmark for future implementations. Additionally, a revised discrete Gaussian sampler is proposed which is the fastest of its type to date, and is also the first to investigate the cost savings of implementing with λ/2 bits of precision. Performance results are promising compared to the hardware designs of the equivalent ring-LWE scheme: in addition to resting on stronger security assumptions, the proposed design generates 1272 encryptions per second and 4395 decryptions per second.
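For orientation, a toy software model of standard-LWE (Regev-style) bit encryption is sketched below; the parameters are illustrative only and bear no relation to the secure parameter sets or the hardware architecture evaluated in the paper:

```python
import numpy as np

# Toy standard-LWE encryption of a single bit; parameters far too small for real security.
n, m, q, sigma = 64, 256, 4093, 3.0
rng = np.random.default_rng(0)

def keygen():
    A = rng.integers(0, q, size=(m, n))
    s = rng.integers(0, q, size=n)                       # secret key
    e = np.rint(rng.normal(0, sigma, size=m)).astype(int)  # small Gaussian error
    b = (A @ s + e) % q
    return (A, b), s

def encrypt(pk, bit):
    A, b = pk
    r = rng.integers(0, 2, size=m)          # random 0/1 combination of the samples
    c1 = (A.T @ r) % q
    c2 = (b @ r + bit * (q // 2)) % q       # encode the bit at q/2
    return c1, c2

def decrypt(sk, ct):
    c1, c2 = ct
    d = (c2 - c1 @ sk) % q                  # ≈ e·r (+ q/2 if bit was 1)
    return int(min(d, q - d) > q // 4)      # closer to q/2 than to 0 -> bit was 1

pk, sk = keygen()
assert all(decrypt(sk, encrypt(pk, bit)) == bit for bit in (0, 1))
```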
Funded under the European Union's Horizon 2020 research and innovation programme, SAFEcrypto will provide a new generation of practical, robust and physically secure post-quantum cryptographic solutions that ensure long-term security for future ICT systems, services and applications. The project will focus on the remarkably versatile field of lattice-based cryptography as the source of computational hardness, and will deliver optimised public key security primitives for digital signatures and authentication, as well as identity-based encryption (IBE) and attribute-based encryption (ABE). This will involve algorithmic and design optimisations, and implementations of lattice-based cryptographic schemes addressing cost, energy consumption, performance and physical robustness. As the National Institute of Standards and Technology (NIST) prepares for the transition to a post-quantum cryptographic Suite B, urging organisations that build systems and infrastructures requiring long-term security to consider this transition in their architectural designs, the SAFEcrypto project will provide proof-of-concept demonstrators of schemes for three practical real-world case studies with long-term security requirements, in the application areas of satellite communications, network security and cloud. The goal is to affirm lattice-based cryptography as an effective replacement for traditional number-theoretic public-key cryptography, by demonstrating that it can address the needs of resource-constrained embedded applications, such as mobile and battery-operated devices, and of real-time high performance applications for cloud and network management infrastructures.
In order to retrieve the secret key in a side-channel attack, the attacker computes distinguisher values using all the available data. A profiling stage is very useful for providing some a priori information about the leakage model. However, profiling is essentially empirical and may not be exhaustive. Therefore, during the attack, the attacker may encounter previously unseen data, which can be troublesome. A lazy workaround is to ignore all such novel observations altogether. In this paper, we show that this is not optimal and can be avoided. Our proposed techniques eventually improve the performance of classical information-theoretic distinguishers in terms of success rate.
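One simple way to avoid discarding novel observations, in the spirit of the argument above (not necessarily the paper's technique), is to give them non-zero probability through additive smoothing of the profiled leakage model; predict_class below is a placeholder for the key-dependent intermediate value:

```python
import math
from collections import Counter, defaultdict

def build_model(profiling_pairs, smoothing=1.0):
    """Estimate P(observation | leakage class) with additive smoothing, so that
    observations never seen during profiling keep a small non-zero probability."""
    counts = defaultdict(Counter)
    for cls, obs in profiling_pairs:
        counts[cls][obs] += 1
    vocab = {obs for c in counts.values() for obs in c}

    def prob(cls, obs):
        total = sum(counts[cls].values()) + smoothing * (len(vocab) + 1)
        return (counts[cls][obs] + smoothing) / total
    return prob

def score_key(key_guess, attack_data, prob, predict_class):
    """Log-likelihood distinguisher; novel observations contribute their smoothed
    probability instead of being dropped from the sum."""
    return sum(math.log(prob(predict_class(key_guess, x), obs))
               for x, obs in attack_data)
```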
Differential privacy is a promising formal approach to data privacy, which provides a quantitative bound on the privacy cost of an algorithm that operates on sensitive information. Several tools have been developed for the formal verification of differentially private algorithms, including program logics and type systems. However, these tools do not capture fundamental techniques that have emerged in recent years, and cannot be used for reasoning about cutting-edge differentially private algorithms. Existing techniques fail to handle three broad classes of algorithms: 1) algorithms where privacy depends on accuracy guarantees, 2) algorithms that are analyzed with the advanced composition theorem, which shows slower growth in the privacy cost, and 3) algorithms that interactively accept adaptive inputs. We address these limitations with a new formalism extending apRHL, a relational program logic that has been used for proving differential privacy of non-interactive algorithms, and incorporating aHL, a (non-relational) program logic for accuracy properties. We illustrate our approach through a single running example, which exemplifies the three classes of algorithms and explores new variants of the Sparse Vector technique, a well-studied algorithm from the privacy literature. We implement our logic in EasyCrypt and formally verify privacy. We also introduce a novel coupling technique called optimal subset coupling that may be of independent interest.
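For reference, the advanced composition theorem referred to above states that $k$ adaptive runs of an $(\varepsilon, \delta)$-differentially private mechanism are $(\varepsilon', k\delta + \delta')$-differentially private for any $\delta' > 0$, where

\[
\varepsilon' \;=\; \sqrt{2k\ln(1/\delta')}\,\varepsilon \;+\; k\varepsilon\,(e^{\varepsilon}-1),
\]

so the privacy cost grows roughly with $\sqrt{k}$ rather than linearly in $k$ as under basic composition.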
Recent advances in differential privacy make it possible to guarantee user privacy while preserving the main characteristics of the data. However, most differential privacy mechanisms assume that the underlying dataset is clean. This paper explores the link between data cleaning and differential privacy in a framework we call PrivateClean. PrivateClean includes a technique for creating private datasets of numerical and discrete-valued attributes, a formalism for privacy-preserving data cleaning, and techniques for answering sum, count, and avg queries after cleaning. We show: (1) how the degree of privacy affects subsequent aggregate query accuracy, (2) how privacy potentially amplifies certain types of errors in a dataset, and (3) how this analysis can be used to tune the degree of privacy. The key insight is to maintain a bipartite graph relating dirty values to clean values and use this graph to estimate biases due to the interaction between cleaning and privacy. We validate these results on four datasets with a variety of well-studied cleaning techniques including using functional dependencies, outlier filtering, and resolving inconsistent attributes.
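As a minimal illustration of how the degree of privacy trades off against aggregate accuracy (this uses the standard Laplace mechanism, not PrivateClean's own mechanism, and the cleaning rule is hypothetical):

```python
import numpy as np

def private_count(records, predicate, epsilon):
    """Laplace mechanism for a count query; smaller epsilon means stronger privacy
    and larger expected error (standard deviation ≈ sqrt(2)/epsilon)."""
    true_count = sum(1 for r in records if predicate(r))
    return true_count + np.random.laplace(scale=1.0 / epsilon)

# Hypothetical cleaning step mapping dirty city names to a canonical value.
raw = ["NYC", "New York", "Boston", "new york"]
cleaned = ["NYC" if r.lower() in ("nyc", "new york") else r for r in raw]

for eps in (0.1, 1.0, 10.0):
    print(eps, private_count(cleaned, lambda r: r == "NYC", eps))
```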
Nowadays, image mining plays a vital role in various areas of our lives, and numerous frameworks based on image mining have been proposed for object recognition, object tracking, image sensing, and medical image diagnosis. Nevertheless, research on image authentication based on image mining is still limited. Therefore, this paper presents an efficient combination of frequent pattern mining and digital watermarking that contributes significantly to the authentication of images transmitted via public networks. The proposed framework exploits robust image features to extract the frequent patterns in the image data. The maximal relevant patterns are used to discriminate between textured and smooth blocks within the image, since textured blocks are more appropriate than smooth blocks for embedding the secret data. The experimental results demonstrate the efficiency of the proposed framework in terms of stability and robustness against different kinds of attacks. The results are promising with respect to preserving image authenticity.
We show for the first time that commodity devices can be used to generate wireless data transmissions that are confined to the human body. Specifically, we show that commodity input devices such as fingerprint sensors and touchpads can be used to transmit information only to wireless receivers that are in contact with the body. We characterize the propagation of the resulting transmissions across the whole body and run experiments with ten subjects to demonstrate that our approach generalizes across different body types and postures. We also evaluate our communication system in the presence of interference from other wearable devices such as smartwatches and nearby metallic surfaces. Finally, by modulating the operations of these input devices, we demonstrate bit rates of up to 50 bits per second over the human body.
In an electrical distribution network, the challenges involved in decentralized power generation and in making the network resilient to failures can be easily anticipated. With the use of information technology, better control can be achieved over the distributed generation units and the handling of faults within them. In this contribution, the use of a graceful degradation strategy is proposed as a means to improve the availability of the system during a fault situation. Graceful degradation is presented as a constraint satisfaction problem: the trigger and the computation of the degradation process are formulated as constraints. The concept of the utility of the resources is used to support a dynamic decision to trigger the degradation process. The computation of the graceful degradation strategy is formalized as an SMT problem and analyzed using the Z3 SMT solver. The approach is illustrated with a use case that applies the degradation strategy to a prosumer node during a power outage in the distribution network, demonstrating the dynamic calculation capability of the degradation scheme in the face of unpredictable power from a renewable energy source.
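A minimal sketch of formulating such a degradation decision with the Z3 Python bindings, assuming hypothetical loads, utilities, and an available-power budget on a prosumer node:

```python
from z3 import Bool, If, Optimize, Sum, is_true, sat

# Hypothetical loads on a prosumer node: (name, power draw in W, utility).
loads = [("heat_pump", 1200, 8), ("ev_charger", 3000, 5),
         ("lighting", 150, 9), ("media", 200, 3)]
available_power = 1600  # W currently delivered by the renewable source (assumed)

opt = Optimize()
on = {name: Bool(name) for name, _, _ in loads}
# Constraint: the loads kept on must fit within the available power budget.
opt.add(Sum([If(on[n], p, 0) for n, p, _ in loads]) <= available_power)
# Objective: degrade gracefully by keeping the most useful loads running.
opt.maximize(Sum([If(on[n], u, 0) for n, _, u in loads]))

if opt.check() == sat:
    m = opt.model()
    kept = [n for n, _, _ in loads if is_true(m.evaluate(on[n]))]
    print("loads kept on:", kept)   # here: heat_pump, lighting, and media
```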
Wireless sensor-actuator networks (WSANs) are being adopted in process industries because of their advantages in lowering deployment and maintenance costs. While there has been significant theoretical advancement in networked control design, only limited empirical results exist that combine control design with realistic WSAN standards. This paper presents a cyber-physical case study on a wireless process control system that integrates state-of-the-art network control design and a WSAN based on the WirelessHART standard. The case study systematically explores the interactions between wireless routing and control design in the process control plant. The network supports alternative routing strategies, including single-path source routing and multi-path graph routing. To mitigate the effect of data loss in the WSAN, the control design integrates an observer based on an Extended Kalman Filter with a model predictive controller and an actuator buffer of recent control inputs. We observe that sensing and actuation can have different levels of resilience to packet loss under this network control design. We then propose a flexible routing approach where the routing strategies for sensing and actuation can be configured separately. Finally, we show that an asymmetric routing configuration with different routing strategies for sensing and actuation can effectively improve control performance under significant packet loss. Our results highlight the importance of jointly designing wireless network protocols and control in wireless control systems.
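The actuator-buffer idea can be sketched as follows; the control sequence and loss rate are made up, and in the case study the sequence would come from the model predictive controller:

```python
import random
from collections import deque

class ActuatorBuffer:
    """Keeps the most recently delivered predicted control sequence so the
    actuator can keep applying sensible inputs when packets are lost."""
    def __init__(self, horizon):
        self.pending = deque(maxlen=horizon)

    def receive(self, control_sequence):
        # A delivered packet carries the controller's whole predicted sequence.
        self.pending = deque(control_sequence, maxlen=len(control_sequence))

    def actuate(self):
        # On packet loss, fall back to the next buffered (previously predicted) input.
        return self.pending.popleft() if self.pending else 0.0

buf = ActuatorBuffer(horizon=5)
for step in range(20):
    delivered = random.random() > 0.3                    # emulate 30% actuation-path loss
    if delivered:
        buf.receive([1.0 - 0.1 * k for k in range(5)])   # hypothetical MPC sequence
    u = buf.actuate()                                    # input actually applied this step
```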
Extensible Cyber-Physical Systems (CPS) are loosely connected, multi-domain platforms that "virtualize" their resources to provide an open platform capable of hosting different cyber-physical applications. These cyber-physical platforms are extensible, since resources and applications can be added or removed at any time. However, realizing such a platform requires resolving challenges emanating from several different properties; in this paper, we focus on resilience. Resilience is important for extensible CPS to ensure that the extensibility of a system does not result in failures and anomalies.
Modern Industrial Control Systems (ICS) rely on enterprise-to-plant-floor connectivity. As the size, diversity, and therefore complexity of ICS increase, so do the operational requirements, goals, and challenges defined by users across the various sub-systems. Recent trends in Information Technology (IT) and Operational Technology (OT) convergence may cause operators to lose a comprehensive understanding of end-to-end data flow requirements. This presents a risk to system security and resilience. Sensors were once applied solely for operational process use, but now act as inputs supporting a diverse set of organisational requirements. If these are not fully understood, incomplete risk assessment and inappropriate implementation of security controls could occur. In search of a solution, operators may turn to standards and guidelines. This paper reviews popular standards and guidelines prior to the presentation of a case study and conceptual tool, highlighting the importance of data flows, critical data processing points, and system-to-user relationships. The proposed approach forms a basis for risk assessment and security control implementation, aiding the evolution of ICS security and resilience.
From pencils to commercial aircraft, every man-made object must be designed and manufactured. When it is cheaper or easier to steal a design or a manufacturing process specification than to invent one's own, the incentive for theft is present. As more and more manufacturing data comes online, incidents of such theft are increasing. In this paper, we present a side-channel attack on manufacturing equipment that reveals both the form of a product and its manufacturing process, i.e., exactly how it is made. In the attack, a human deliberately or accidentally places an attack-enabled phone close to the equipment or makes or receives a phone call on any phone nearby. The phone executing the attack records audio and, optionally, magnetometer data. We present a method of reconstructing the product's form and manufacturing process from the captured data, based on machine learning, signal processing, and human assistance. We demonstrate the attack on a 3D printer and a CNC mill, each with its own acoustic signature, and discuss the commonalities in the sensor data captured for these two different machines. We compare the quality of the data captured with a variety of smartphone models. Capturing data from the 3D printer, we reproduce the form and process information of objects previously unknown to the reconstructors. On average, our accuracy is within 1 mm in reconstructing the length of a line segment in a fabricated object's shape and within 1 degree in determining an angle in a fabricated object's shape. We conclude with recommendations for defending against these attacks.
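A plausible skeleton for the audio side of such a reconstruction pipeline, using generic MFCC features and an off-the-shelf classifier (this is not the paper's pipeline; X_train and y_train are placeholders for labelled recordings of known machine movements):

```python
import numpy as np
import librosa
from sklearn.ensemble import RandomForestClassifier

def mfcc_features(audio, sr, frame_s=0.5):
    """Split a recording into fixed windows and summarise each with MFCC statistics."""
    hop = int(frame_s * sr)
    feats = []
    for start in range(0, len(audio) - hop, hop):
        window = audio[start:start + hop]
        mfcc = librosa.feature.mfcc(y=window, sr=sr, n_mfcc=13)
        feats.append(np.concatenate([mfcc.mean(axis=1), mfcc.std(axis=1)]))
    return np.array(feats)

# X_train / y_train would come from labelled recordings of known machine moves
# (e.g. axis, direction, speed); both names are placeholders here.
# clf = RandomForestClassifier(n_estimators=200).fit(X_train, y_train)
# predicted_moves = clf.predict(mfcc_features(attack_audio, sr))
```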
The identification of relevant information in large text databases is a challenging task. One of the reasons is human beings' limitations in handling large volumes of data. A common solution for scavenging data from texts is the word cloud. A word cloud illustrates word usage in a document by resizing individual words proportionally to how frequently they appear. Even though word clouds are easy to understand, they are not particularly efficient, because they are static. In addition, the presented information lacks context, i.e., words are not explained and they may lead to radically erroneous interpretations. To tackle these problems we developed VCloud, a tool that allows the user to interact with word clouds, thereby allowing informative and interactive data exploration. Furthermore, our tool also allows one to compare two data sets presented as word clouds. We evaluated VCloud using real data about the evolution of gastritis research through the years. The papers indexed by PubMed related to this medical context were selected for visualization and data analysis using VCloud. A domain expert explored these visualizations and was able to extract useful information from them. This illustrates how VCloud can be a valuable tool for visual text analytics.
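The core sizing step of any word cloud (frequency-proportional font sizes) can be sketched in a few lines; the point-size range and tokenizer here are arbitrary choices, not VCloud's:

```python
import re
from collections import Counter

def word_cloud_sizes(text, min_pt=10, max_pt=48, top_n=50):
    """Map each of the top_n most frequent words to a font size that scales
    linearly with its frequency."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(words).most_common(top_n)
    lo, hi = counts[-1][1], counts[0][1]
    span = max(hi - lo, 1)
    return {w: min_pt + (max_pt - min_pt) * (c - lo) / span for w, c in counts}

sizes = word_cloud_sizes("gastritis research on gastritis and helicobacter pylori research")
```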
In this article, I present the questions that I seek to answer in my PhD research. I propose to analyze natural language text with the help of semantic annotations and to mine important events for navigating large text corpora. Semantic annotations such as named entities, geographic locations, and temporal expressions can help us mine events from a given corpus. These events then provide us with useful means to discover the knowledge locked within it. I pose three problems that can help unlock this knowledge vault in semantically annotated text corpora: i. identifying important events; ii. semantic search; and iii. event analytics.
Today, mobile data owners lack consent and control over the release and utilization of their location data. Third-party applications continuously process and access location data without data owners' granular control and without knowledge of how location data is being used. The proliferation of GPS-enabled IoT devices will lead to larger scale abuses of trust. In this paper we present the first design and implementation of a privacy module built into the GPSD daemon. The GPSD daemon is a low-level GPS interface that runs on GPS-enabled devices. The integration of the privacy module ensures that data owners have granular control over the release of their GPS location. We describe the design of our privacy module and its integration into the GPSD daemon.
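A minimal sketch of the kind of per-application release policy such a privacy module could enforce (the policy table and granularity are hypothetical and do not reflect GPSD's actual configuration or API):

```python
# Hypothetical per-application policies chosen by the data owner.
POLICIES = {"weather_app": "coarse", "navigation": "exact", "flashlight": "deny"}

def release_fix(app, lat, lon):
    """Filter a GPS fix according to the data owner's policy before release."""
    policy = POLICIES.get(app, "deny")
    if policy == "exact":
        return lat, lon
    if policy == "coarse":
        # Roughly 1 km granularity: round to two decimal places.
        return round(lat, 2), round(lon, 2)
    return None  # "deny": withhold the fix entirely

print(release_fix("weather_app", 47.60621, -122.33207))  # -> (47.61, -122.33)
```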
Security and privacy are crucial for all IT systems and services. The diversity of applications places high demands on the knowledge and experience of software developers and IT professionals. Besides programming skills, security and privacy aspects are required as well and must be considered during development. If developers have not been trained in these topics, it is especially difficult for them to prevent problematic security issues such as vulnerabilities. In this work we present an interactive e-learning platform focusing on solving real-world cybersecurity tasks in a sandboxed web environment. With our platform students can learn and understand how security vulnerabilities can be exploited in different scenarios. The platform has been evaluated in four university IT security courses with around 1100 participants over three years.
Nowadays, both the number of cyberattacks and their sophistication have considerably increased, and their prevention concerns many organizations. Cooperation by means of information sharing is a promising strategy to address this problem, but unfortunately it poses many challenges. Indeed, establishing a win-win environment is not straightforward, and organizations are not properly motivated to share information. This work presents a model to analyse the benefits and drawbacks of information sharing among organizations that present a certain level of dependency. The proposed model applies functional dependency network analysis to emulate attack propagation and game theory for information sharing management. We present a simulation framework implementing the model that allows for testing different sharing strategies under several network and attack settings. Experiments using simulated environments show how the proposed model provides insights into which conditions and scenarios are beneficial for information sharing.
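A toy version of the game-theoretic part, with hypothetical payoffs for two interdependent organizations choosing whether to share threat information, illustrates why cooperation is not self-sustaining without further incentives:

```python
# Hypothetical payoff matrix (row: org A, column: org B); values are (payoff_A, payoff_B).
# Sharing carries a small cost but yields a mutual benefit when reciprocated.
payoffs = {
    ("share", "share"):       (3, 3),
    ("share", "withhold"):    (0, 4),
    ("withhold", "share"):    (4, 0),
    ("withhold", "withhold"): (1, 1),
}
actions = ["share", "withhold"]

def is_nash(a, b):
    """Pure-strategy Nash check: neither organization gains by deviating alone."""
    pa, pb = payoffs[(a, b)]
    best_a = all(payoffs[(x, b)][0] <= pa for x in actions)
    best_b = all(payoffs[(a, y)][1] <= pb for y in actions)
    return best_a and best_b

equilibria = [(a, b) for a in actions for b in actions if is_nash(a, b)]
print(equilibria)  # with these payoffs, mutual withholding is the only equilibrium
```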
REMAN is a reputation management infrastructure for composite Web services. It supports the aggregation of client feedback on the perceived QoS of external services, using reputation mechanisms to build service rankings. Changes in rankings are proactively notified to composite service clients to enable self-tuning properties in their execution.
In this paper we propose a mechanism for predicting domestic human activity in a smart home context. We use those predictions to adapt the behavior of home appliances whose impact on the environment is delayed (for example, heating). The behaviors of the appliances are built by a reinforcement learning mechanism. We compare the behavior built by the learning approach with both a merely reactive behavior and a state-remanent behavior.
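A minimal tabular Q-learning sketch for an appliance whose effect is delayed; the environment model, state encoding, and reward are invented for illustration and are not the paper's:

```python
import random
from collections import defaultdict

# Toy Q-learning for a heater: state = (hour of day, someone home soon?),
# action = heating on/off. Comfort only materialises later, hence the delayed impact.
alpha, gamma, epsilon = 0.1, 0.9, 0.1
Q = defaultdict(float)
actions = [0, 1]  # 0 = off, 1 = on

def choose(state):
    if random.random() < epsilon:
        return random.choice(actions)          # explore
    return max(actions, key=lambda a: Q[(state, a)])  # exploit

def step(state, action):
    """Hypothetical environment: reward trades energy cost against the comfort
    obtained when the house is warm by the time someone arrives."""
    hour, home_soon = state
    comfort = 1.0 if (home_soon and action == 1) else 0.0
    reward = comfort - 0.3 * action            # energy cost of keeping the heater on
    next_state = ((hour + 1) % 24, random.random() < 0.3)
    return next_state, reward

state = (0, False)
for _ in range(10000):
    a = choose(state)
    nxt, r = step(state, a)
    best_next = max(Q[(nxt, b)] for b in actions)
    Q[(state, a)] += alpha * (r + gamma * best_next - Q[(state, a)])
    state = nxt
```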