Bibliography
HDFS has been widely used for storing massive-scale data, which is vulnerable to site disasters. File system backup is an important strategy for data retention. In this paper, we present an efficient, easy-to-use Backup and Disaster Recovery System for HDFS. The system includes a client based on HDFS with the additional feature of remote backup, and a remote server with an HDFS cluster to keep the backup data. It supports full backups and regular incremental backups to the server with very low cost and high throughput. In our experiments, the average speed of backup and recovery reaches 95 MB/s, approaching the theoretical maximum speed of gigabit Ethernet.
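Below is a minimal sketch, assuming a hypothetical plan_incremental_backup helper and synthetic file listings, of how an incremental backup pass might pick only the files modified since the last successful backup; it is illustrative only, not the paper's actual client implementation.

```python
# Illustrative sketch (not the paper's implementation): select HDFS files for an
# incremental backup pass by comparing modification times against the time of the
# last successful backup. In a real deployment the listing would come from the
# HDFS client API and the selected files would be copied to the remote backup cluster.
from dataclasses import dataclass
from typing import List

@dataclass
class HdfsEntry:
    path: str        # e.g. "/data/logs/part-00001"
    mtime: float     # modification time (epoch seconds)
    size: int        # bytes

def plan_incremental_backup(listing: List[HdfsEntry], last_backup_time: float) -> List[HdfsEntry]:
    """Return the entries that changed since the last backup and must be re-copied."""
    return [e for e in listing if e.mtime > last_backup_time]

if __name__ == "__main__":
    listing = [
        HdfsEntry("/data/a.parquet", mtime=1_700_000_000, size=128 << 20),
        HdfsEntry("/data/b.parquet", mtime=1_700_100_000, size=256 << 20),
    ]
    todo = plan_incremental_backup(listing, last_backup_time=1_700_050_000)
    print([e.path for e in todo])   # only files modified after the last backup
```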
As the Internet becomes an important part of the infrastructure our society depends on, it is crucial to construct networks that are able to work even when part of the network is compromised. This paper presents the first practical intrusion-tolerant network service, targeting high-value applications such as monitoring and control of global clouds and management of critical infrastructure for the power grid. We use an overlay approach to leverage the existing IP infrastructure while providing the required resiliency and timeliness. Our solution overcomes malicious attacks and compromises in both the underlying network infrastructure and in the overlay itself. We deploy and evaluate the intrusion-tolerant overlay implementation on a global cloud spanning East Asia, North America, and Europe, and make it publicly available.
In a number of information security scenarios, human beings can be better than technical security measures at detecting threats. This is particularly the case when a threat is based on deception of the user rather than exploitation of a specific technical flaw, as is the case of spear-phishing, application spoofing, multimedia masquerading and other semantic social engineering attacks. Here, we put the concept of the human-as-a-security-sensor to the test with a first case study on a small number of participants subjected to different attacks in a controlled laboratory environment and provided with a mechanism to report these attacks if they spot them. A key challenge is to estimate the reliability of each report, which we address with a machine learning approach. For comparison, we evaluate the ability of known technical security countermeasures in detecting the same threats. This initial proof of concept study shows that the concept is viable.
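As a rough illustration of the report-reliability estimation step, the sketch below trains a generic classifier on made-up report features; the feature names, model choice, and data are assumptions, not the study's actual pipeline.

```python
# Minimal sketch (hypothetical features and model, not the study's actual pipeline):
# train a classifier to estimate how reliable a user-submitted threat report is.
from sklearn.ensemble import RandomForestClassifier

# Each row: [user_past_accuracy, report_detail_score, seconds_to_report, matched_known_signature]
X_train = [
    [0.90, 0.8, 30.0, 0],
    [0.20, 0.1, 400.0, 0],
    [0.75, 0.6, 60.0, 1],
    [0.10, 0.3, 900.0, 0],
]
y_train = [1, 0, 1, 0]   # 1 = report turned out to describe a genuine attack

clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_train, y_train)
new_report = [[0.65, 0.7, 45.0, 1]]
print("estimated reliability:", clf.predict_proba(new_report)[0][1])
```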
Reliable intrusion detection is the basis of security in cognitive radio networks (CRNs). So far, few scholars have applied intrusion detection systems (IDSs) to combat intrusions against CRNs. In order to improve the performance of intrusion detection in CRNs, a distributed intrusion detection scheme has been proposed. In this paper, a method based on Dempster-Shafer (D-S) evidence theory for detecting intrusions in CRNs is put forward, in which the detection data and credibility of the different local IDS agents are combined by D-S at the cooperative detection center, so that the different local detection decisions are taken into consideration in the final decision. The effectiveness of the proposed scheme is verified by simulation, and the results show a noticeable performance improvement of the proposed scheme over the traditional method.
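For reference, Dempster's rule of combination, the fusion step named above, can be sketched as follows for two local IDS agents over the frame {normal, intrusion}; the mass values are illustrative, not the paper's data.

```python
# Minimal sketch of Dempster's rule of combination for two local IDS agents
# (illustrative numbers; the frame of discernment is {normal, intrusion}).
from itertools import product

def combine(m1, m2):
    """Combine two mass functions defined over frozensets of hypotheses."""
    combined, conflict = {}, 0.0
    for (a, pa), (b, pb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            combined[inter] = combined.get(inter, 0.0) + pa * pb
        else:
            conflict += pa * pb                     # mass falling on the empty set
    return {k: v / (1.0 - conflict) for k, v in combined.items()}  # normalize

N, I = frozenset({"normal"}), frozenset({"intrusion"})
theta = N | I                                        # total ignorance
agent1 = {I: 0.6, N: 0.1, theta: 0.3}                # local IDS agent 1 (weighted by its credibility)
agent2 = {I: 0.5, N: 0.2, theta: 0.3}                # local IDS agent 2
print(combine(agent1, agent2))                       # fused belief at the cooperative detection center
```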
The Internet of Things (IoT) depicts an intelligent future in which IoT-based devices have the sensing and computing capabilities to interact with each other. We are living in the era of the Internet and rapidly moving towards a smart planet where devices can connect to each other. Cooperative ad-hoc vehicle systems are the main driving force for the realization of the IoT concept. The Vehicular Ad-hoc Network (VANET) is considered a promising platform for intelligent wireless communication systems. This paper presents and analyzes the security-reliability tradeoff (SRT) of an IoT-based VANET system in the presence of eavesdropping attacks, using smart vehicle relays based on an opportunistic relay selection (ORS) scheme. The optimization of the distances between the source (S), destination (D), and eavesdropper (E) is then illustrated in detail, showing the effect of this parameter on the IoT-based network. In order to improve the SRT, we quantify the attainable SRT improvement with variable distances between IoT-based nodes. It is shown that, given a maximum tolerable Intercept Probability (IP), the Outage Probability (OP) of our proposed model approaches zero as Ge → ∞, where Ge is the distance ratio between S and E via the vehicle relay (R).
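For context, the security-reliability tradeoff is usually quantified with the standard outage and intercept probability definitions from the physical-layer security literature, sketched below; the paper's exact expressions for the ORS scheme may differ.

```latex
% Standard SRT metrics (common forms from the physical-layer security literature;
% the paper's exact expressions for the ORS scheme may differ): outage probability
% at the destination and intercept probability at the eavesdropper, for
% main/eavesdropper channel capacities C_d, C_e and target rate R.
\[
  P_{\mathrm{out}} = \Pr\{\, C_d < R \,\}, \qquad
  P_{\mathrm{int}} = \Pr\{\, C_e > R \,\}
\]
% The SRT is the trade-off between P_out and P_int as the relay-selection and
% distance parameters (e.g., the ratio Ge) vary.
```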
We consider the problem of covert communication over a state-dependent channel, where the transmitter has non-causal knowledge of the channel states. Here, “covert” means that the probability that a warden on the channel can detect the communication must be small. In contrast with traditional models without non-causal channel-state information at the transmitter, we show that covert communication is possible at a positive rate. We derive closed-form formulas for the maximum achievable covert communication rate (the “covert capacity”) in this setting for discrete memoryless channels as well as additive white Gaussian noise channels. We also derive lower bounds on the rate of the secret key that is needed for the transmitter and the receiver to achieve the covert capacity.
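For context, covertness is commonly formalized as a divergence constraint on the warden's observations, sketched below; this is the standard formulation, and the paper's exact constraint may differ.

```latex
% Common formalization of covertness (a sketch; the paper's exact constraint may differ):
% the warden observes Z^n with distribution \hat{P}_{Z^n} when the transmitter is active
% and P^0_{Z^n} when it is silent. Covertness requires these to be nearly
% indistinguishable, e.g. via a relative-entropy bound
\[
  D\!\left(\hat{P}_{Z^n} \,\middle\|\, P^0_{Z^n}\right) \le \delta ,
\]
% which, by Pinsker's inequality, lower-bounds the sum of the warden's false-alarm
% and missed-detection probabilities by roughly $1 - \sqrt{\delta/2}$.
```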
This paper proposes a context-aware, graph-based approach for identifying anomalous user activities via user profile analysis, which obtains a group of users maximally similar among themselves as well as to the query at test time. The main challenges for the anomaly detection task are: (1) rare occurrences of anomalies, making exhaustive identification with a reasonable false-alarm rate difficult, and (2) continuously evolving, context-dependent anomaly types, making it difficult to synthesize the activities a priori. Our proposed query-adaptive graph-based optimization approach, solvable using a maximum-flow algorithm, is designed to fully utilize both the mutual similarities among the user models and their respective similarities with the query to shortlist the user profiles for a more reliable aggregated detection. Each user activity is represented using inputs from several multi-modal resources, which helps to localize anomalies from time-dependent data efficiently. Experiments on public datasets of insider threats and gesture recognition show impressive results.
Today's control systems, such as smart environments, have the ability to adapt to their environment in order to achieve a set of objectives (e.g., comfort, security, and energy savings). This is done by changing their behaviour upon the occurrence of specific events. Building such a system requires designing and implementing autonomic loops that collect events and measurements, make decisions, and execute the corresponding actions. The design and implementation of such loops are made difficult by several factors: the complexity of systems with multiple objectives, the risk of conflicting decisions between multiple loops, the inconsistencies that can result from communication errors and hardware failures, and the heterogeneity of the devices. In this paper, we propose a design framework for reliable and self-adaptive systems, where multiple autonomic loops can be composed into complex managers, and we consider its application to smart environments. On top of the proposed framework we build a generic autonomic loop which combines an automata-based controller that makes correct and coherent decisions, a transactional execution mechanism that avoids inconsistencies, and an abstraction layer that hides the heterogeneity of the devices. We propose patterns for composing such loops in parallel, in a coordinated manner, and hierarchically, leveraging automata-based modular constructs that provide guarantees on the correct behaviour of the controlled system. We implement our framework with the transactional middleware LINC, the reactive language Heptagon/BZR, and the abstraction framework PUTUTU. A case study in the field of building automation is presented to illustrate the proposed framework.
Network systems, such as transportation systems and water supply systems, play important roles in our daily life and industrial production. However, a variety of disruptive events occur during their lifetime, causing serious losses. Because disruptions are inevitable, we should not only focus on improving the reliability or resistance of the system, but also pay attention to its ability to respond in a timely manner and recover rapidly from disruptive events; that is, we need to pay more attention to resilience. In this paper, we describe two resilience models, quotient resilience and integral resilience, to measure the final recovered performance and the cumulative performance during the recovery process, respectively. Based on these two models, we optimize the system recovery strategies after disruption, focusing on the repair sequence of the damaged components and the resource allocation scheme. The proposed research can serve as guidance to prioritize repair tasks and allocate resources reasonably.
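Common forms of these two metrics from the resilience literature are sketched below, with P(t) the system performance, t_d the disruption time, and t_r the end of recovery; the paper's exact normalizations may differ.

```latex
% Common forms of the two metrics (the paper's exact normalizations may differ).
% Quotient resilience: fraction of the lost performance that is finally recovered.
\[
  R_q = \frac{P(t_r) - P(t_d^{+})}{P(t_d^{-}) - P(t_d^{+})}
\]
% Integral resilience: normalized cumulative performance over the recovery horizon,
% rewarding strategies that restore performance early.
\[
  R_i = \frac{\int_{t_d}^{t_r} P(t)\,dt}{P(t_d^{-})\,(t_r - t_d)}
\]
```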
Due to growing performance requirements, embedded systems are increasingly complex. Meanwhile, they are also expected to be reliable. Guaranteeing reliability on complex systems is very challenging. Consequently, there is a substantial need for designs that enable the use of unverified components, such as a real-time operating system (RTOS), without requiring their correctness to guarantee safety. In this work, we propose a novel approach to design a controller that enables the system to restart and remain safe during and after the restart. Complementing this controller with switching logic allows the system to use a complex, unverified controller to drive the system as long as it does not jeopardize safety. Such a design also tolerates faults that occur in the underlying software layers, such as the RTOS and middleware, and recovers from them through system-level restarts that reinitialize the software (middleware, RTOS, and applications) from read-only storage. Our approach is implementable on a single commercial off-the-shelf (COTS) processing unit. To demonstrate the efficacy of our solution, we fully implement a controller for a 3-degree-of-freedom (3DOF) helicopter. We test the system by injecting various types of faults into the applications and the RTOS and verify that the system remains safe.
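A minimal sketch of the switching idea follows, with a placeholder safety-envelope check and placeholder controllers (the bounds and outputs are invented); it is not the paper's implementation, only an illustration of preferring the unverified controller while a restart-based fallback remains safe.

```python
# Sketch of the switching idea (placeholder envelope check and controllers, not the
# paper's implementation): run the complex, unverified controller while the plant
# state is still safely recoverable; otherwise hand over to the verified safety
# controller and request a system-level restart from read-only storage.

def inside_safety_envelope(state) -> bool:
    """Conservative check that the safety controller can still keep the plant safe."""
    pitch, roll = state
    return abs(pitch) < 0.3 and abs(roll) < 0.3       # hypothetical bounds (radians)

def complex_controller(state):
    return (0.10, -0.05)                              # placeholder high-performance output

def safety_controller(state):
    return (0.0, 0.0)                                 # placeholder verified, conservative output

def control_step(state):
    """Return (actuator_command, restart_needed)."""
    if inside_safety_envelope(state):
        return complex_controller(state), False       # performance mode
    # Outside the recoverable region: safety mode, and reinitialize the software
    # stack (applications, middleware, RTOS) during the next restart window.
    return safety_controller(state), True

print(control_step((0.1, 0.0)))    # -> ((0.1, -0.05), False)
print(control_step((0.5, 0.0)))    # -> ((0.0, 0.0), True)
```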
This paper proposes a practical time-phased model to analyze the vulnerability of power systems over a time horizon in which the scheduled maintenance of network facilities is considered. The model is an efficient tool that system operators can use to assess how vulnerable their systems become given a set of scheduled facility outages. The final model is presented as a single-level Mixed-Integer Linear Programming (MILP) problem solvable with commercially available software. Results obtained on the well-known IEEE 24-Bus Reliability Test System (RTS) demonstrate the applicability of the model and highlight the necessity of considering scheduled facility outages when assessing the vulnerability of a power system.
This work presents a highly reliable and tamper-resistant design of a Physical Unclonable Function (PUF) exploiting Resistive Random Access Memory (RRAM). The RRAM PUF properties such as uniqueness and reliability are experimentally measured on 1 kb HfO2-based RRAM arrays. First, our experimental results show that the selection of the split reference and the offset of the split sense amplifier (S/A) significantly affect uniqueness. More dummy cells generate a more accurate split reference, and relaxing the transistor sizes of the split S/A reduces the offset, thus achieving better uniqueness. The average inter-Hamming distance (HD) of 40 RRAM PUF instances is 42%. Second, we propose using the sum of the read-out currents of multiple RRAM cells to generate one response bit, which statistically minimizes the risk of early retention failure of a single cell. The measurement results show that with 8 cells per bit, 0% intra-HD can be maintained for more than 50 hours at 150 °C, or equivalently 10 years at 69 °C by 1/kT extrapolation. Finally, we propose a layout obfuscation scheme in which all the S/As are randomly embedded into the RRAM array to improve the RRAM PUF's resistance against invasive tampering. The RRAM cells are uniformly placed between M4 and M5 across the array, so if an adversary attempts to invasively probe the output of the S/A, they have to remove the top-level interconnect and destroy the RRAM cells between the interconnect layers. The RRAM PUF therefore has a “self-destructive” feature. The hardware overhead of the proposed design strategies is benchmarked on a 64 × 128 RRAM PUF array at 65 nm; while these optimization strategies increase latency, energy, and area over a naive implementation, they significantly improve the performance and security.
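The multi-cell idea can be illustrated with the synthetic model below: one response bit comes from comparing the summed read-out currents of k cells against a split reference, so a single drifting cell rarely flips the bit. The current distribution and parameters are invented, not measured data.

```python
# Illustrative model (synthetic currents, not measured data): generate one PUF
# response bit from the sum of K RRAM cell read-out currents compared against a
# split reference, so a single cell's retention drift rarely flips the bit.
import random

def response_bit(cell_currents, split_reference):
    return 1 if sum(cell_currents) > split_reference else 0

def hamming_distance(a, b):
    return sum(x != y for x, y in zip(a, b)) / len(a)

random.seed(0)
K = 8                                   # cells per response bit
N_BITS = 128

def make_instance():
    """Random per-cell currents around a nominal value, one list of K cells per bit."""
    return [[random.gauss(1.0, 0.3) for _ in range(K)] for _ in range(N_BITS)]

ref = K * 1.0                           # split reference for the summed current
puf_a = [response_bit(cells, ref) for cells in make_instance()]
puf_b = [response_bit(cells, ref) for cells in make_instance()]
print("inter-instance HD ~", hamming_distance(puf_a, puf_b))   # ideally close to 0.5
```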
Cyber-physical system integrity requires both hardware and software security. Many cyber attacks succeed because they are designed to selectively target a specific hardware or software component in an embedded system and trigger its failure. Existing security measures also use attack vector models and isolate the malicious component as a countermeasure. Isolated security primitives do not provide the overall trust required in an embedded system. Trust enhancements are proposed for a hardware security platform, where the trust specifications are implemented in both software and hardware. This distribution of trust makes it difficult for a hardware-only or software-only attack to cripple the system. The proposed approach is applied to a smart grid application consisting of third-party soft IP cores, where an attack on this module can result in a blackout. System integrity is preserved in the event of an attack, and the anomalous behavior of the IP core is recorded by a supervisory module. The IP core also provides a snapshot of its trust metric, which is logged for further diagnostics.
In this paper, the design of an event-driven middleware for general-purpose services in the smart grid (SG) is presented. The main purpose is to provide a peer-to-peer distributed software infrastructure that allows multiple new, authorized actors to access SG information in order to provide new services. To achieve this, the proposed middleware has been designed to be: 1) event-based; 2) reliable; 3) secure against malicious information and communication technology attacks; and 4) able to provide hardware-independent interoperability between heterogeneous technologies. To demonstrate practical deployment, a numerical case study applied to the whole U.K. distribution network is presented, and the capabilities of the proposed infrastructure are discussed.
In this paper we discuss several improvements to the security and reliability of a classic Bluetooth network (piconet) that arise from being able to transmit the same frame on two frequencies in each slot, instead of the single frequency used in the current standard. Furthermore, we build upon this possibility and show that piconet participants can explore many strategies to increase the security of their communications by confounding eavesdroppers, such as multiple hopping sequences, random selection of a hopping sequence on each transmission slot, and variable frame encryption per hopping sequence. Finally, all of this can be decided independently by any piconet participant, without having to agree in real time on some type of service with other participants of the same piconet.
In Wireless Sensor Networks (WSNs), data aggregation has been used to reduce bandwidth and energy costs during a data collection process. However, while data aggregation brings the benefit of improved bandwidth usage and energy efficiency, it also introduces opportunities for security attacks, thus reducing data delivery reliability. There is a trade-off between bandwidth and energy efficiency on the one hand and data delivery reliability on the other. In this paper, we present a comparative study of the reliability and efficiency characteristics of different data aggregation approaches using both simulation studies and testbed evaluations. We also analyse the factors that contribute to network congestion and affect data delivery reliability. Finally, we investigate an optimal trade-off between the reliability and efficiency properties of the different approaches by using an intermediate approach, called the Multi-Aggregator based Multi-Cast (MAMC) data aggregation approach. Our evaluation results for MAMC show that it is possible to achieve reliability and efficiency at the same time.
At the core of the "Big Data" revolution lie frameworks and systems that allow for the massively parallel processing of large amounts of data. Ironically, while they have been designed for processing large amounts of data, these systems are at the same time major producers of data: to support the administration and management of these huge-scale systems, they are configured to generate detailed log and monitoring data, periodically capturing the system state across all nodes, components and jobs in the system. While such logging information is used routinely by sysadmins for ad-hoc trouble-shooting and problem diagnosis, we point out that there is a tremendous value in analyzing such data from a research point of view. In this talk, we will go over several case studies that demonstrate how measuring and analyzing measurement data from production systems can provide new insights into how systems work and fail, and how these new insights can help in designing better systems.
Past generations of software developers were well on the way to building a software engineering mindset/gestalt, preferring tools and techniques that concentrated on safety, security, reliability, and code re-usability. Computing education reflected these priorities and was, to a great extent, organized around these themes, providing beginning software developers a basis for professional practice. In more recent times, economic and deadline pressures and the de-professionalism of practitioners have combined to drive a development agenda that retains little respect for quality considerations. As a result, we are now deep into a new and severe software crisis. Scarcely a day passes without news of either a debilitating data breach or website hack, or the failure of a mega-software project. Vendors, individual developers, and possibly educators can anticipate an equally destructive flood of malpractice litigation, for the argument that they systematically and recklessly ignored known best development practice of long standing is irrefutable. Yet we continue to instruct using methods and to employ development tools we know, or ought to know, are inherently insecure, unreliable, and unsafe, and that produce software of like ilk. The authors call for a renewed professional and educational focus on software quality, focusing on redesigned tools that enable and encourage known best practice, combined with reformed educational practices that emphasize writing human-readable, safe, secure, and reliable software. Practitioners can only deploy sound management techniques, appropriate tool choice, and best-practice development methodologies such as thorough planning and specification, scope management, factorization, modularity, safety, and appropriate team and testing strategies, if those ideas and techniques are embedded in the curriculum from the beginning. The authors have instantiated their ideas in the form of their highly disciplined new version of Niklaus Wirth's 1980s Modula-2 programming notation under the working moniker Modula-2 R10. They are now working on an implementation that will be released under a liberal open source license in the hope that it will assist in reforming the CS curriculum around a best-practices core so as to empower would-be professionals with the intellectual and practical mindset to begin resolving the software crisis. They acknowledge there is no single software engineering silver bullet, but assert that professional techniques can be inculcated throughout a student's four-year university tenure, and, if implemented in the workplace, these can greatly reduce the likelihood of multiplied IT failures at the hands of our graduates. The authors maintain that professional excellence is a necessary mindset, a habit of self-discipline that must be intentionally embedded in all aspects of one's education, and subsequently drive all aspects of one's practice, including, but by no means limited to, the choice and use of programming tools.
The Internet of Things (IoT) is an embedded system design paradigm that connects a variety of devices, sensors, and physical objects to a larger connected network (e.g., the Internet) without requiring human-to-human or human-to-computer interaction. While the IoT is expected to expand the user's connectivity and everyday convenience, there are serious security considerations that come into account when using the IoT for distributed authentication. Furthermore, the incorporation of biometrics into IoT design brings concerns of cost and of implementing a 'user-friendly' design. In this paper, we focus on the use of electrocardiogram (ECG) signals to implement distributed biometric authentication within an IoT system model. Our observations show that ECG biometrics are highly reliable, more secure, and easier to implement than other biometrics.
Cryptographic systems are vulnerable to random errors and injected faults. Soft errors can inadvertently happen in critical cryptographic modules, and attackers can inject faults into systems to retrieve the embedded secret. Different schemes have been developed to improve the security and reliability of cryptographic systems. As the new SHA-3 standard, the Keccak algorithm will be widely used in various cryptographic applications, and its implementation should be protected against random errors and injected faults. In this paper, we devise different parity checking methods to protect the operations of Keccak. Results show that our schemes can be easily implemented and can effectively protect the Keccak system against random errors and fault attacks.
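As a simplified illustration of parity-based fault detection (not the paper's specific schemes, which also predict parities through the round functions), the sketch below stores one parity bit per lane of a Keccak-like 5×5 state and detects a single injected bit-flip.

```python
# Simplified illustration of parity-based fault detection on a Keccak-like state
# (5x5 lanes of 64 bits). We store one parity bit per lane, flip a single bit to
# emulate an injected fault, and detect the mismatch by recomputing parities.
# The paper's schemes additionally predict parities through the round operations,
# which this sketch does not attempt.
import random

def lane_parity(lane: int) -> int:
    return bin(lane).count("1") & 1

random.seed(1)
state = [[random.getrandbits(64) for _ in range(5)] for _ in range(5)]
check_bits = [[lane_parity(state[x][y]) for y in range(5)] for x in range(5)]

# Fault injection: flip one random bit of one lane.
fx, fy, fbit = random.randrange(5), random.randrange(5), random.randrange(64)
state[fx][fy] ^= (1 << fbit)

# Detection: any lane whose recomputed parity disagrees with its stored check bit.
errors = [(x, y) for x in range(5) for y in range(5)
          if lane_parity(state[x][y]) != check_bits[x][y]]
print("fault detected in lanes:", errors)    # -> [(fx, fy)]
```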
Software-defined networking (SDN) is an emerging technology for controlling flows through networks. Used in the context of industrial control systems, an objective is to design configurations that have built-in protection against hardware failures, in the sense that the configuration has "baked-in" backup routes. The objective is to leave the configuration static as long as possible, minimizing the need to have the controller push in new routing and filtering rules. We have designed and implemented a tool that enables us to determine the complete connectivity map from an analysis of all switch configurations in the network. We can use this tool to explore the impact of a link failure, in particular to determine whether the failure induces loss of the ability to deliver a flow even after the built-in backup routes are used. A measure of the original configuration's resilience to link failure is the mean number of link failures required to induce the first such loss of service. The computational cost of each link failure and subsequent analysis is large, so there is much to be gained by reducing the overall cost of obtaining a statistically valid estimate of resiliency. This paper shows that when analysis of a network state can identify all as-yet-unfailed links any one of whose failure would induce loss of a flow, we can use the technique of importance sampling to estimate the mean number of links required to fail before some flow is lost, and analyze the potential for reducing the variance of the sample statistic. We provide both theoretical and empirical evidence for significant variance reduction.
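The generic importance-sampling estimator behind this approach is sketched below; the paper's specific proposal distribution, built from the identified critical links, is what drives the variance reduction.

```latex
% Generic importance-sampling estimator (the paper's specific proposal distribution
% exploits the identified critical links). Let N(\omega) be the number of link
% failures before the first flow loss on a sampled failure sequence \omega.
% Sampling \omega_1,\dots,\omega_M from a proposal q instead of the true failure
% law p gives the unbiased estimate
\[
  \widehat{\mathbb{E}_p[N]} \;=\; \frac{1}{M}\sum_{i=1}^{M} N(\omega_i)\,
      \frac{p(\omega_i)}{q(\omega_i)},
\]
% and variance is reduced by choosing q so that N(\omega_i)\,p(\omega_i)/q(\omega_i)
% varies less across samples than N(\omega_i) does under p.
```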
In recent years, the vulnerability of the agricultural products supply chain has been exposed by a continual stream of security incidents of varying scope and severity, ranging from natural disasters to disruptions at individual nodes of the supply chain. Because of its abundant agricultural products, the Eastern Area is a very important part of Hunan Province, and the security of its agricultural products supply chain is tied to the safety and stability of economic development in the entire region. In order to make the empirical risk management analysis more objective, scientific, and practical, this work applies the AHP-FCS method to move from qualitative to quantitative analysis of agricultural product supply chain risk management, identifying and evaluating the probability and severity of each possible risk.
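The standard AHP weighting step referenced here can be sketched as follows, with an invented 3×3 pairwise-comparison matrix rather than the paper's actual judgments.

```python
# Standard AHP weighting step (invented 3x3 pairwise-comparison matrix, not the
# paper's data): derive criterion weights from the principal eigenvector and
# check consistency with Saaty's consistency ratio.
import numpy as np

A = np.array([[1.0, 3.0, 5.0],       # e.g. natural-disaster vs. logistics vs. market risk
              [1/3, 1.0, 2.0],
              [1/5, 1/2, 1.0]])

eigvals, eigvecs = np.linalg.eig(A)
k = np.argmax(eigvals.real)
weights = np.abs(eigvecs[:, k].real)
weights /= weights.sum()                       # priority weights for the risk criteria

n = A.shape[0]
lam_max = eigvals.real[k]
CI = (lam_max - n) / (n - 1)                   # consistency index
RI = {3: 0.58, 4: 0.90, 5: 1.12}[n]            # Saaty's random index
print("weights:", np.round(weights, 3), "CR:", round(CI / RI, 3))   # CR < 0.1 is acceptable
```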
Exhaustive enumeration of an S-select-k problem for hypothesized substation outages can be practically infeasible due to the exponential growth of combinations as both S and k increase. This enumeration of worst-case substation scenarios from the large set, however, can be improved by basing the initial selection sets on root nodes and segments. In this paper, the previous work on the reverse pyramid model (RPM) is enhanced with prioritization of root nodes and defined segmentation of the substation list based on the mean-time-to-compromise (MTTC) value associated with each substation. Root nodes are selected based on threshold values of the substation ranking by MTTC and are segmented accordingly from the root node set. Each segment is then enumerated with the S-select-k module to identify worst-case scenarios. Substations at the lowest threshold value on the list, e.g., those with no assigned MTTC or an extremely low value, are eliminated entirely. Simulation shows that this approach produces similar risk indices across all randomly generated MTTC values for the IEEE 30-bus system.
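An illustrative sketch of the filter-segment-enumerate idea follows; the MTTC values, threshold, and segment width are invented, not the RPM parameters used in the paper.

```python
# Illustrative sketch of the filter-segment-enumerate idea (invented MTTC values,
# threshold, and segment width; not the paper's RPM parameters): drop substations
# below an MTTC threshold, split the remainder into segments around the top-ranked
# root nodes, and enumerate k-combinations within each segment rather than over
# the full substation set.
from itertools import combinations

mttc = {"sub1": 120.0, "sub2": 95.0, "sub3": 60.0, "sub4": 40.0, "sub5": 0.0}
THRESHOLD = 10.0     # substations below this (e.g. no assigned MTTC) are eliminated
K = 2                # size of each hypothesized outage set
SEGMENT = 3          # hypothetical segment width around the top-ranked root nodes

ranked = sorted((s for s, v in mttc.items() if v >= THRESHOLD),
                key=lambda s: mttc[s], reverse=True)
segments = [ranked[i:i + SEGMENT] for i in range(0, len(ranked), SEGMENT)]

for seg in segments:
    for outage_set in combinations(seg, K):
        print("evaluate risk index for outage set:", outage_set)
```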