Biblio
Many emerging applications, from domains such as healthcare and oil & gas, require several data processing systems for complex analytics. This demo paper showcases system, a framework that provides multi-platform task execution for such applications. It features a three-layer data processing abstraction and a new query optimization approach for multi-platform settings. We will demonstrate the strengths of system by using real-world scenarios from three different applications, namely, machine learning, data cleaning, and data fusion.
Learning classifier systems (LCSs) are rule-based evolutionary algorithms uniquely suited to classification and data mining in complex, multi-factorial, and heterogeneous problems. LCS rule fitness is commonly based on accuracy, but this metric alone is not ideal for assessing global rule 'value' in noisy problem domains, and thus impedes effective knowledge extraction. Multi-objective fitness functions are promising but rely on knowledge of how to weigh objective importance. Prior knowledge would be unavailable in most real-world problems. The Pareto-front concept offers a strategy for multi-objective machine learning that is agnostic to objective importance. We propose a Pareto-inspired multi-objective rule fitness (PIMORF) for LCS, and combine it with a complementary rule-compaction approach (SRC). We implemented these strategies in ExSTraCS, a successful supervised LCS, and evaluated performance over an array of complex simulated noisy and clean problems (i.e., genetic and multiplexer) that each concurrently model pure interaction effects and heterogeneity. While evaluation over multiple performance metrics yielded mixed results, this work represents an important first step towards efficiently learning complex problem spaces without the advantage of prior problem knowledge. Overall, the results suggest that PIMORF paired with SRC improved rule set interpretability, particularly with regard to heterogeneous patterns.
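As a concrete illustration of the Pareto-front idea this abstract relies on, the sketch below computes the non-dominated set of rules scored on two objectives; the objective names (accuracy and coverage) are hypothetical stand-ins, not necessarily the objectives PIMORF combines.

```python
# Minimal sketch: the Pareto front of classifier rules scored on two
# objectives (accuracy, coverage) -- hypothetical stand-ins for the
# objectives weighed by a multi-objective rule fitness.

def dominates(a, b):
    """a dominates b if it is no worse on every objective and strictly
    better on at least one (maximization assumed)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

def pareto_front(rules):
    """Return the non-dominated subset of (accuracy, coverage) tuples."""
    return [r for r in rules if not any(dominates(o, r) for o in rules)]

rules = [(0.95, 0.10), (0.80, 0.40), (0.60, 0.70), (0.75, 0.35)]
print(pareto_front(rules))  # (0.75, 0.35) is dominated by (0.80, 0.40)
```

Because dominance needs no weights, such a fitness can rank rules without prior knowledge of objective importance.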
More and more systems use mobile devices to perform sensing tasks, which increases the risk of leaking personal data and compromising privacy. Data hiding is one of the important techniques for information security. Even though many data hiding algorithms aim at providing more hiding capacity or higher PSNR, few can control PSNR effectively while ensuring hiding capacity. In this paper, we first propose PSNR-Controllable Data Hiding (PCDH), a novel encoding scheme for data hiding with controllable PSNR based on LSB substitution. In PCDH, we use a remainder algorithm to compute the hidden information and hide the secret information in the last x LSBs of every pixel. Theoretical proof shows that this method can bound the deviation of the stego image from the cover image and control the PSNR by adjusting parameters in the remainder calculation. We then design encoding and decoding algorithms with low computational complexity. Experimental results show that PCDH can keep the PSNR within a given range while ensuring high hiding capacity. In addition, it resists several steganalysis methods well. Compared to other algorithms, PCDH achieves a better tradeoff among PSNR, hiding capacity, and computational complexity.
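To make the capacity/PSNR trade-off concrete, here is a minimal sketch of plain x-LSB substitution together with the PSNR computation; PCDH's remainder-based encoding is not reproduced, and the function names are ours.

```python
# Minimal sketch of x-LSB substitution and the resulting PSNR; the
# remainder-based encoding that gives PCDH its PSNR control is not shown.

import math

def embed_lsb(pixels, bits, x=2):
    """Hide a bit string in the x least significant bits of the pixels."""
    stego = list(pixels)
    for i in range(0, len(bits), x):
        chunk = bits[i:i + x].ljust(x, '0')        # next x payload bits
        p = i // x
        stego[p] = ((stego[p] >> x) << x) | int(chunk, 2)
    return stego

def psnr(cover, stego):
    mse = sum((c - s) ** 2 for c, s in zip(cover, stego)) / len(cover)
    return float('inf') if mse == 0 else 10 * math.log10(255 ** 2 / mse)

cover = [200, 131, 57, 88]
stego = embed_lsb(cover, '1011', x=2)   # each pixel changes by < 2**x
print(stego, psnr(cover, stego))        # [202, 131, 57, 88], ~48 dB
```

Since each pixel deviates by at most 2**x - 1, the choice of x bounds the worst-case MSE and hence the PSNR floor, which is the lever a PSNR-controllable scheme adjusts.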
In this paper, we study the problem of privacy information leakage in a smart grid. The privacy risk is assumed to be caused by an unauthorized binary hypothesis test of the consumer's behaviour based on the smart meter readings of energy supplied by the energy provider. Additional energy supplies are produced by an alternative energy source. A controller equipped with an energy storage device manages the energy inflows to satisfy the energy demand of the consumer. We study the optimal energy control strategy that minimizes the asymptotic exponential decay rate of the minimum Type II error probability in the unauthorized hypothesis test, thereby suppressing the privacy risk. Our study shows that, under the optimal control strategy, the cardinality of the set of energy supplies from the energy provider is no more than two. This result implies a simple structure for the optimal energy control strategy. When additional side information is available to the adversary, the optimal control strategy and privacy risk are compared with the case where only the smart meter readings are leaked to the adversary.
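For context, the error exponent being minimized is the classical one from the Chernoff-Stein lemma; a textbook statement (for i.i.d. observations, not this paper's specific controlled-storage model) is:

```latex
% Chernoff-Stein lemma: for any fixed bound on the Type I error, the
% minimum Type II error probability \beta_n over n meter readings
% decays exponentially at the Kullback-Leibler rate
\lim_{n \to \infty} -\frac{1}{n} \log \beta_n
    = D(P_0 \,\|\, P_1)
    = \sum_{x} P_0(x) \log \frac{P_0(x)}{P_1(x)}
```

where P_0 and P_1 denote the reading distributions under the two behavioural hypotheses; an energy control strategy protects privacy by shaping these distributions to keep the divergence small.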
In this paper, a new method for quantitative evaluation of the security of cyber-physical systems (CPSs) is proposed. The proposed method models the different classes of adversarial attacks against CPSs, including cross-domain attacks, i.e., cyber-to-cyber and cyber-to-physical attacks. It also takes the secondary consequences of attacks on CPSs into consideration. The intrusion process of attackers is modeled using an attack graph, and the consequence estimation process of the attack is investigated using a process model. The security attributes and the special parameters involved in the security analysis of CPSs have been identified and considered. The quantitative evaluation is carried out using the probability of attacks, the time-to-shutdown of the system, and security risks. The proposed model is validated through a case study that applies it to a boiling water power plant and estimates suitable security measures.
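As a toy illustration of the attack-graph component, the sketch below scores one intrusion path by multiplying per-step exploit success probabilities; the node names and probabilities are hypothetical, and the paper's process model and consequence estimation are not reproduced.

```python
# Toy attack graph: edges carry hypothetical exploit success
# probabilities; a path's probability is the product along its steps.

graph = {
    "internet": [("hmi", 0.6)],   # compromise the operator HMI
    "hmi": [("plc", 0.4)],        # pivot from cyber to physical (PLC)
    "plc": [],
}

def path_probability(path, graph):
    p = 1.0
    for src, dst in zip(path, path[1:]):
        p *= dict(graph[src])[dst]
    return p

print(path_probability(["internet", "hmi", "plc"], graph))  # 0.6 * 0.4
```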
Stretch Processing (SP) is a radar signal processing technique that provides high range resolution by processing large-bandwidth signals with lower-rate analog-to-digital converters (ADCs). The range resolution of the large-bandwidth signal is obtained by looking into a limited range window with low-rate ADC samples. The target space in the observed range window is sparse, and compressive sensing (CS) is an important tool to further decrease the number of measurements and sparsely reconstruct the target space for sparse scenes with a known basis, which is the Fourier basis in the general application of SP. Although classical CS techniques might be directly applied to SP, reconstruction performance degrades due to off-grid targets. In this paper, the applicability of the compressive sensing framework and its sparse signal recovery techniques to stretch processing is studied with off-grid cases in mind. For sparsity-based robust SP, the Perturbed Parameter Orthogonal Matching Pursuit (PPOMP) algorithm is proposed. PPOMP is an iterative technique that estimates off-grid target parameters through gradient descent. To compute the error between actual and reconstructed parameters, the Earth Mover's Distance (EMD) is used. The performance of the proposed algorithm is compared with classical CS and SP techniques.
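For readers unfamiliar with the baseline, here is a minimal sketch of standard on-grid Orthogonal Matching Pursuit over a Fourier dictionary; PPOMP's gradient-descent perturbation of the atom parameters is not reproduced.

```python
# Minimal sketch of on-grid Orthogonal Matching Pursuit, the baseline
# that PPOMP extends with per-atom parameter refinement.

import numpy as np

def omp(A, y, k):
    """Greedily recover a k-sparse x with y ~ A @ x."""
    residual, support = y.astype(complex), []
    for _ in range(k):
        # Select the dictionary atom most correlated with the residual.
        support.append(int(np.argmax(np.abs(A.conj().T @ residual))))
        # Least-squares fit on the chosen support, then update residual.
        coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        residual = y - A[:, support] @ coef
    x = np.zeros(A.shape[1], dtype=complex)
    x[support] = coef
    return x

# Fourier dictionary: off-grid targets fall between its columns, which
# is exactly the mismatch PPOMP is designed to compensate for.
n = 64
A = np.exp(-2j * np.pi * np.outer(np.arange(n), np.arange(n)) / n) / np.sqrt(n)
x_true = np.zeros(n, dtype=complex)
x_true[[5, 17, 40]] = [1.0, 0.7, 0.5]
print(np.allclose(omp(A, A @ x_true, 3), x_true))  # True for on-grid targets
```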
This paper reviews some existing speech enhancement techniques and also proposes a new method for enhancing speech by combining compressed sensing and Kalman filtering. The approach reconstructs the noisy speech signal using the Compressive Sampling Matching Pursuit (CoSaMP) algorithm and further enhances it with a Kalman filter. The performance of the proposed method is evaluated and compared with that of existing techniques in terms of intelligibility and quality measure parameters of speech. The proposed algorithm shows improved performance compared to spectral subtraction, MMSE, Wiener filter, signal subspace, and Kalman filter approaches in terms of WSS, LLR, SegSNR, SNRloss, PESQ, and overall quality.
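As a sketch of the enhancement stage, the following scalar Kalman recursion smooths a noisy 1-D signal; the AR coefficient and noise variances are assumptions, and the CoSaMP reconstruction step is not shown.

```python
# Minimal sketch of the scalar Kalman recursion for the enhancement
# stage; a (AR coefficient), q (process noise), and r (measurement
# noise) are illustrative values, not the paper's parameters.

def kalman_enhance(y, a=0.95, q=1e-3, r=1e-1):
    """Smooth a noisy 1-D signal y with a first-order AR state model."""
    x, p, out = 0.0, 1.0, []
    for z in y:
        # Predict the next state and its error variance.
        x, p = a * x, a * a * p + q
        # Update with the noisy observation z.
        k = p / (p + r)
        x, p = x + k * (z - x), (1 - k) * p
        out.append(x)
    return out
```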
Despite its great importance, modern network infrastructure is remarkable for the lack of rigor in its engineering. The Internet, which began as a research experiment, was never designed to handle the users and applications it hosts today. The lack of formalization of the Internet architecture meant limited abstractions and modularity, particularly for the control and management planes, thus requiring a new protocol built from scratch for every new need. This led to an unwieldy, ossified Internet architecture resistant to any attempts at formal verification, and to an Internet culture where expediency and pragmatism are favored over formal correctness. Fortunately, recent work in the space of clean-slate Internet design, in particular the software-defined networking (SDN) paradigm, offers the Internet community another chance to develop the right kind of architecture and abstractions. This has also led to a great resurgence of interest in applying formal methods to the specification, verification, and synthesis of networking protocols and applications. In this paper, we present a self-contained tutorial of the formidable amount of work that has been done in formal methods and present a survey of its applications to networking.
Firewall policies are notorious for misconfiguration errors that can defeat their intended purpose of protecting hosts in the network from malicious users. We believe this is because today's firewall policies are mostly monolithic. Inspired by ideas from modular programming and code refactoring, in this work we introduce three kinds of modules: primary, auxiliary, and template, which facilitate the refactoring of a firewall policy into smaller, reusable, comprehensible, and more manageable components. We present algorithms for generating each of the three modules for a given legacy firewall policy. We also develop ModFP, an automated tool for converting legacy firewall policies represented as access control lists to their modularized format. With the help of ModFP, when examining several real-world policies with sizes ranging from dozens to hundreds of rules, we were able to identify subtle errors.
To help establish a more scientific basis for security science, which will enable the development of fundamental theories and move the field from being primarily reactive to primarily proactive, it is important for research results to be reported in a scientifically rigorous manner. Such reporting will allow for the standard pillars of science, namely replication, meta-analysis, and theory building. In this paper we aim to establish a baseline of the state of scientific work in security through the analysis of indicators of scientific research as reported in the papers from the 2015 IEEE Symposium on Security and Privacy. To conduct this analysis, we developed a series of rubrics to determine the completeness of the papers relative to the type of evaluation used (e.g. case study, experiment, proof). Our findings showed that while papers are generally easy to read, they often do not explicitly document some key information like the research objectives, the process for choosing the cases to include in the studies, and the threats to validity. We hope that this initial analysis will serve as a baseline against which we can measure the advancement of the science of security.
As smart meters continue to be deployed around the world, collecting unprecedented levels of fine-grained data about consumers, we need to find mechanisms that are fair to both (1) the electric utility, who needs the data to improve its operations, and (2) the consumer, who has a valuation of privacy but at the same time benefits from sharing consumption data. In this paper we address this problem by proposing privacy contracts between electric utilities and consumers with the goal of maximizing the social welfare of both. Our mathematical model formulates an optimization problem over a population of users with different privacy valuations and the utility's costs of operation. We then show how contracts can change depending on the probability of a privacy breach. This line of research can help inform not only current but also future smart meter collection practices.
Eight hundred eighty-seven phishing emails from Arizona State University, Brown University, and Cornell University were assessed by two reviewers for Cialdini's six principles of persuasion: authority, social proof, liking/similarity, commitment/consistency, scarcity, and reciprocation. A correlational analysis of email characteristics by year revealed that the persuasion principles of commitment/consistency and scarcity have increased over time, while the principles of reciprocation and social proof have decreased over time. Authority and liking/similarity showed mixed results, with certain characteristics increasing and others decreasing. Results from this study can inform user training about phishing emails and help cybersecurity software become more effective.
Applications such as fleet management and logistics, emergency response, public security and surveillance, and mobile workforce management use geo-positioning and mobile networks as means of enabling real-time monitoring, communication, and collaboration among a possibly large set of mobile nodes. The majority of those systems require real-time tracking of mobile nodes (e.g., vehicles, people, or mobile robots), reliable communication to/from the nodes, as well as group communication among the mobile nodes. In this paper we describe a distributed middleware with a focus on the management of context-defined groups of mobile nodes and on group communication with large sets of nodes. We also present a prototype Fleet Tracking and Management system based on our middleware, give an example of how context-specific group communication can enhance the nodes' mutual awareness, and show initial performance results that indicate small overhead and latency of the group communication and management.
Using heterogeneous clouds has been considered to improve the performance of big-data analytics for healthcare platforms. However, the problem of delay when transferring big data over the network needs to be addressed. The purpose of this paper is to analyze and compare existing cloud computing environments (PaaS, IaaS) in order to implement middleware services. Understanding the differences and similarities between cloud technologies will help in the interconnection of healthcare platforms. The paper provides a general overview of the techniques and interfaces for cloud computing middleware services, and proposes a cloud architecture for healthcare. Cloud middleware enables heterogeneous devices to act as data sources and to integrate data from other healthcare platforms, but specific APIs need to be developed. Furthermore, security and management problems need to be addressed, given the heterogeneous nature of the communication and computing environment. The present paper fills a gap in the electronic healthcare register literature by providing an overview of cloud computing middleware services and standardized interfaces for integration with medical devices.
Lo et al. (2011) proposed an efficient key assignment scheme for access control in a large leaf class hierarchy, where alterations in leaf classes are more frequent than in non-leaf classes. Their scheme is based on a public-key cryptosystem and a hash function, in which operations like modular exponentiation are much more costly than symmetric-key encryptions, decryptions, and hash computations. Their scheme performs better than previously proposed schemes. However, in this paper, we show that Lo et al.'s scheme fails to preserve the forward security property: a security class can still derive the secret keys of its successor classes even after the security class has been deleted from the hierarchy. We propose a new key management scheme for dynamic access control in a large leaf class hierarchy that makes use of a symmetric-key cryptosystem and a one-way hash function. We show that our scheme requires significantly less storage and computational overhead than Lo et al.'s scheme and other related schemes. Through informal and formal security analysis, we further show that our scheme is secure against all possible attacks, including attacks on forward security. In addition, our scheme efficiently supports dynamic access control compared to Lo et al.'s scheme and other related schemes. Thus, higher security along with low storage and computational costs makes our scheme more suitable for practical applications than other schemes.
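A generic construction in the spirit of such schemes (not the paper's exact key-assignment algorithm) derives each child class's key by hashing the parent key with the child's identifier, so key derivation flows only downward:

```python
# Minimal sketch of top-down key derivation with a one-way hash; the
# identifiers and the root secret are hypothetical.

import hashlib

def derive_key(parent_key: bytes, child_id: str) -> bytes:
    """Child key = H(parent_key || child_id). A parent can compute all
    successor keys, but the hash's one-wayness blocks the reverse."""
    return hashlib.sha256(parent_key + child_id.encode()).digest()

root = b"root-secret"
k_leaf = derive_key(derive_key(root, "dept-A"), "leaf-3")
```

Under this construction, forward security reduces to re-keying the affected subtree whenever a class is deleted, so the removed class's old key no longer yields current successor keys.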
Existing main-memory hash join algorithms for multi-core can be classified into two camps. Hardware-oblivious hash join variants do not depend on hardware-specific parameters. Rather, they consider qualitative characteristics of modern hardware and are expected to achieve good performance on any technologically similar platform. The assumption behind these algorithms is that hardware is now good enough at hiding its own limitations, through automatic hardware prefetching, out-of-order execution, or simultaneous multi-threading (SMT), to make hardware-oblivious algorithms competitive without the overhead of carefully tuning to the underlying hardware. Hardware-conscious implementations, such as (parallel) radix join, aim to maximally exploit a given architecture by tuning the algorithm parameters (e.g., hash table sizes) to the particular features of the architecture. The assumption here is that explicit parameter tuning yields enough performance advantages to warrant the effort required. This paper compares the two approaches under a wide range of workloads (relative table sizes, tuple sizes, effects of sorted data, etc.) and configuration parameters (VM page sizes, number of threads, number of cores, SMT, SIMD, prefetching, etc.). The results show that hardware-conscious algorithms generally outperform hardware-oblivious ones. However, on specific workloads and special architectures with aggressive simultaneous multi-threading, hardware-oblivious algorithms are competitive. The main conclusion of the paper is that, on existing multi-core architectures, it is still important to carefully tailor algorithms to the underlying hardware to get the necessary performance. But processor developments may require revisiting this conclusion in the future.
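The hardware-conscious side is exemplified by the radix-partitioning step sketched below: tuples are scattered into enough partitions that each per-partition hash table fits in cache. This is a simplified single-pass, single-threaded sketch; the parallel and multi-pass details of a real radix join are omitted.

```python
# Minimal sketch of the radix-partitioning step of a radix join; the
# number of partition bits is the hardware-tuned parameter.

def radix_partition(tuples, bits=4):
    """Scatter (key, payload) tuples into 2**bits partitions by the
    low bits of the key's hash."""
    parts = [[] for _ in range(1 << bits)]
    mask = (1 << bits) - 1
    for key, payload in tuples:
        parts[hash(key) & mask].append((key, payload))
    return parts

# A join then builds and probes a small hash table per partition,
# pairing partition i of R only with partition i of S.
R = [(k, f"r{k}") for k in range(1000)]
parts = radix_partition(R, bits=4)
```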
A One-Time Password (OTP) is a fixed-length string used to perform authentication in electronic media, valid for a single use. In this paper, OTP production methods based on hash functions are investigated. The Keccak digest algorithm is used to produce the OTP. This algorithm was selected as the latest hash algorithm standard in October 2012 by the National Institute of Standards and Technology, and it is preferred because it is faster and safer than the alternatives. OTP production based on hash functions follows the Hash-Based Message Authentication Code (HMAC) construction, in which a key value is combined with the hash function to generate the HMAC value; the produced OTP is derived from this value. In this application, the OTP length was set to eight characters to be useful in practice.
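A minimal sketch of such an HMAC-based OTP follows, using hashlib's SHA-3 (the standardized Keccak variant) and an RFC 4226-style dynamic truncation to eight digits; the exact truncation rule used in the paper is an assumption.

```python
# Minimal sketch: HMAC over a counter with SHA-3 (Keccak), truncated
# HOTP-style to an eight-digit one-time password.

import hmac, hashlib

def otp(key: bytes, counter: int, digits: int = 8) -> str:
    msg = counter.to_bytes(8, "big")
    mac = hmac.new(key, msg, hashlib.sha3_256).digest()
    offset = mac[-1] & 0x0F                       # dynamic truncation
    code = int.from_bytes(mac[offset:offset + 4], "big") & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)

print(otp(b"shared-secret", 1))  # a fresh 8-digit password per counter
```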
A novel short-time Fourier transform (STFT) domain adaptive filtering scheme is proposed that can be easily combined with nonlinear post-filters such as residual echo or noise reduction in acoustic echo cancellation. Unlike normal STFT subband adaptive filters, which suffer from aliasing artifacts due to their poor prototype filter, our scheme achieves good accuracy by exploiting the relationship between the linear convolution and the poor prototype filter, i.e., the STFT window function. The effectiveness of our scheme was confirmed through simulations comparing it with conventional methods.
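For reference, the conventional baseline contrasted here is a one-tap-per-bin subband adaptive filter; a minimal complex NLMS sketch for a single STFT bin follows (the proposed window-function correction is not reproduced).

```python
# Minimal sketch: complex NLMS for one STFT frequency bin. X and D are
# the far-end and microphone STFT sequences (frames,) for that bin.

import numpy as np

def nlms_bin(X, D, mu=0.5, eps=1e-8):
    w = 0j                                  # single complex tap
    E = np.zeros(len(D), dtype=complex)
    for t in range(len(D)):
        y = w * X[t]                        # echo estimate for frame t
        E[t] = D[t] - y                     # echo-cancelled output
        w += mu * np.conj(X[t]) * E[t] / (abs(X[t]) ** 2 + eps)
    return E, w
```

The single tap per bin ignores how linear convolution leaks across frames and bins, which is the source of the aliasing artifacts the proposed scheme addresses.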
This paper presents a novel visual analytics technique developed to support exploratory search tasks over event data document collections. The technique supports discovery and exploration by clustering results and overlaying cluster summaries onto coordinated timeline and map views. Users can also explore and interact with search results by selecting clusters to filter and re-cluster the data, with animation used to smooth the transition between views. The technique demonstrates a number of advantages over alternative methods for displaying and exploring geo-referenced search results and spatio-temporal data. First, cluster summaries can be presented in a manner that makes them easy to read and scan. Listing representative events from each cluster also helps the process of discovery by preserving the diversity of results. In addition, clicking on visual representations of geo-temporal clusters provides a quick and intuitive way to navigate across space and time simultaneously, removing the need to overload users with too many event labels at any one time. The technique was evaluated with a group of nineteen users and compared with an equivalent text-based exploratory search engine.
Recent events have brought to light the increasingly intertwined nature of modern infrastructures. As a result, much effort is being put towards protecting these vital infrastructures, without which modern society suffers dire consequences. These infrastructures, due to their intricate nature, behave in complex ways. Improving their resilience and understanding their behavior requires a collaborative effort between the private sector that operates these infrastructures and the government sector that regulates them. This collaboration, in the form of information sharing, requires a new type of information network whose goal is twofold: to enable infrastructure operators to share status information among interdependent infrastructure nodes, and to allow the sharing of vital information concerning threats and other contingencies in the form of alerts. A communication model that meets these requirements while maintaining flexibility and scalability is presented in this paper.