Biblio
Texting while driving has emerged as a significant threat to citizen safety. In this study, we utilize general deterrence theory (GDT), protection motivation theory, and personality traits to evaluate texting while driving (TWD) compliance intentions among teenage drivers. This paper presents the results of our pilot study, in which we administered an online survey to 105 teenage and young adult drivers. The potential implications for research, practice, and policy are discussed.
This study examines the effectiveness of virtual reality technology at creating an immersive user experience in which participants experience firsthand the extreme negative consequences of smartphone use while driving. Research suggests that distracted driving caused by smartphones is related to smartphone addiction and causes fatalities. Twenty-two individuals participated in the virtual reality user experience (VRUE), in which they were asked to drive a virtual car using an Oculus Rift headset, a LeapMotion hand tracking device, and a force feedback steering wheel and pedals. While driving in the simulation, participants were asked to interact with a smartphone. After a period of trying to manage both tasks, a vehicle appears before them and they are involved in a head-on collision. Initial results indicated that participants felt a strong sense of presence, and a change in or reinforcement of the participants' perception of the dangers of smartphone use while driving was observed.
In today's world, the security of companies' data receives greater emphasis than ever. Despite huge investments made by companies to keep their systems safe, many information systems security breaches infiltrate companies' systems and consequently affect their economic capacity, reputation, and customers' confidence. The literature suggests that almost all investments in information systems security have focused only on technological solutions. However, this partial view of the complex information systems security problem has been found to be insufficient, and hence there is an increasing call for researchers to include social factors in the solution space. One such social factor is culture. Thus, in this research we study how national culture influences employees' intention to violate or comply with their company's ISS policy. We construct and test an empirical model using survey data obtained from employees who are working in Ethiopia.
Smart Spaces are composed of heterogeneous sensors and devices that collect and share information. This information may contain personal information of the users. Thus, securing the data and preserving privacy are of paramount importance. In this paper, we propose techniques for information security and privacy protection for Smart Spaces based on the Smart-M3 platform. We propose a) a security framework, and b) a context-aware role-based access control scheme. We model our access control scheme using ontological techniques and the Web Ontology Language (OWL), and implement it via CLIPS rules. To evaluate the efficiency of our access control scheme, we measure the time it takes to check the access rights of the access requests. The results demonstrate that the highest response time is approximately 0.2 seconds in a set of 100,000 triples. We conclude that the proposed access control scheme produces low overhead and is therefore an efficient approach for Smart Spaces.
We introduce a Markov-model-based framework for Moving Target Defense (MTD) analysis. The framework allows modeling of a broad range of MTD strategies, provides general theorems about how the probability of a successful adversary defeating an MTD strategy is related to the amount of time/cost spent by the adversary, and shows how a multilevel composition of MTD strategies can be analyzed by a straightforward combination of the analysis for each one of these strategies. Within the proposed framework we define the concept of security capacity which measures the strength or effectiveness of an MTD strategy: the security capacity depends on MTD specific parameters and more general system parameters. We apply our framework to two concrete MTD strategies.
We study the value of data privacy in a game-theoretic model of trading private data, where a data collector purchases private data from strategic data subjects (individuals) through an incentive mechanism. The private data of each individual represents her knowledge about an underlying state, which is the information that the data collector desires to learn. Unlike most of the existing work on privacy-aware surveys, our model does not assume the data collector to be trustworthy. Thus, an individual takes full control of her own data privacy and reports only a privacy-preserving version of her data. In this paper, the value of ε units of privacy is measured by the minimum payment of all nonnegative payment mechanisms, under which an individual's best response at a Nash equilibrium is to report the data with a privacy level of ε. The higher ε is, the less private the reported data is. We derive lower and upper bounds on the value of privacy which are asymptotically tight as the number of data subjects becomes large. Specifically, the lower bound assures that it is impossible to buy ε units of privacy with a smaller payment, and the upper bound is given by an achievable payment mechanism that we designed. Based on these fundamental limits, we further derive lower and upper bounds on the minimum total payment for the data collector to achieve a given learning accuracy target, and show that the total payment of the designed mechanism is at most one individual's payment away from the minimum.
Maintaining and updating signature databases is a tedious task that normally requires a large amount of user effort. The problem becomes harder when features can be distorted by observation noise, which we call volatility. To address this issue, we propose algorithms and models to automatically generate signatures in the presence of noise, with a focus on stack fingerprinting, which is a research area that aims to discover the operating system (OS) of remote hosts using TCP/IP packets. Armed with this framework, we construct a database with 420 network stacks, label the signatures, develop a robust classifier for this database, and fingerprint 66M visible webservers on the Internet.
In this paper, we introduce Entropy/IP: a system that discovers Internet address structure based on analyses of a subset of IPv6 addresses known to be active, i.e., training data, gleaned by readily available passive and active means. The system is completely automated and employs a combination of information-theoretic and machine learning techniques to probabilistically model IPv6 addresses. We present results showing that our system is effective in exposing structural characteristics of portions of the active IPv6 Internet address space, populated by clients, services, and routers. In addition to visualizing the address structure for exploration, the system uses its models to generate candidate addresses for scanning. For each of 15 evaluated datasets, we train on 1K addresses and generate 1M candidates for scanning. We achieve some success in 14 datasets, finding up to 40% of the generated addresses to be active. In 11 of these datasets, we find active network identifiers (e.g., /64 prefixes or "subnets") not seen in training. Thus, we provide the first evidence that it is practical to discover subnets and hosts by scanning probabilistically selected areas of the IPv6 address space not known to contain active hosts a priori.
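As a rough illustration of the information-theoretic side of such an analysis, the sketch below computes the normalized Shannon entropy of each hex digit (nybble) position across a set of IPv6 addresses. This is only a minimal sketch under our own assumptions, not the Entropy/IP system itself: the function name `nybble_entropies` and the sample addresses (taken from the 2001:db8::/32 documentation prefix) are hypothetical.

```python
import ipaddress
import math
from collections import Counter


def nybble_entropies(addresses):
    """Return 32 normalized entropies in [0, 1], one per hex digit position.

    Hypothetical helper for illustration; not part of any published tool.
    """
    # Expand each address to its full 32-hex-digit form, e.g. "20010db8...0001".
    rows = [ipaddress.IPv6Address(a).exploded.replace(":", "") for a in addresses]
    total = len(rows)
    entropies = []
    for pos in range(32):
        counts = Counter(row[pos] for row in rows)
        # Shannon entropy in bits: sum of p * log2(1/p) over observed digits.
        h = sum((c / total) * math.log2(total / c) for c in counts.values())
        # A nybble carries at most 4 bits, so divide by 4 to normalize.
        entropies.append(h / 4.0)
    return entropies


if __name__ == "__main__":
    sample = ["2001:db8::1", "2001:db8::2", "2001:db8::3", "2001:db8::4"]
    for pos, h in enumerate(nybble_entropies(sample)):
        print(pos, round(h, 3))
```

Positions with entropy near 0 are constant across the sample (for example, a shared prefix), while positions near 1 look close to uniformly random; contiguous runs of similar entropy hint at where structural segments of the address may begin and end.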
We introduce a model for differentially private analysis of weighted graphs in which the graph topology (V, E) is assumed to be public and the private information consists only of the edge weights w : E → R+. This can express hiding congestion patterns in a known system of roads. Differential privacy requires that the output of an algorithm provides little advantage, measured by privacy parameters ε and δ, for distinguishing between neighboring inputs, which are thought of as inputs that differ on the contribution of one individual. In our model, two weight functions w, w' are considered to be neighboring if they have ℓ1 distance at most one. We study the problems of privately releasing a short path between a pair of vertices and of privately releasing approximate distances between all pairs of vertices. We are concerned with the approximation error, the difference between the length of the released path or released distance and the length of the shortest path or actual distance. For the problem of privately releasing a short path between a pair of vertices, we prove a lower bound of Ω(|V|) on the additive approximation error for fixed privacy parameters ε, δ. We provide a differentially private algorithm that matches this error bound up to a logarithmic factor and releases paths between all pairs of vertices, not just a single pair. The approximation error achieved by our algorithm can be bounded by the number of edges on the shortest path, so we achieve better accuracy than the worst-case bound for pairs of vertices that are connected by a low-weight path consisting of o(|V|) vertices. For the problem of privately releasing all-pairs distances, we show that for trees we can release all-pairs distances with approximation error O(log^2.5 |V|) for fixed privacy parameters. For arbitrary bounded-weight graphs with edge weights in [0, M] we can release all distances with approximation error Õ(√(|V|M)).
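To make the neighboring relation concrete: because neighboring weight functions differ by at most one in ℓ1 distance, the standard Laplace mechanism with scale 1/ε applied independently to every edge weight yields an ε-differentially private weight vector, and anything computed from it (such as a shortest path) is post-processing. The sketch below illustrates that generic idea only; it is not claimed to be the algorithm analyzed in the abstract, all function names are hypothetical, and Python's `random` module is used purely for illustration and is not suitable for a real privacy deployment.

```python
import heapq
import random


def laplace(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # is Laplace-distributed with that scale.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_weights(weights, epsilon, rng):
    """Add Laplace(1/epsilon) noise to every edge weight (hypothetical helper).

    Clamping at zero is post-processing, so it does not affect privacy.
    """
    return {e: max(0.0, w + laplace(1.0 / epsilon, rng)) for e, w in weights.items()}


def shortest_path(weights, source, target):
    """Dijkstra on an undirected graph given as {(u, v): weight}."""
    adj = {}
    for (u, v), w in weights.items():
        adj.setdefault(u, []).append((v, w))
        adj.setdefault(v, []).append((u, w))
    dist = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale queue entry
        if u == target:
            break
        for v, w in adj.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    if target not in dist:
        return None
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return path[::-1]


if __name__ == "__main__":
    true_w = {("a", "b"): 1.0, ("b", "c"): 1.0, ("a", "c"): 5.0}
    released = noisy_weights(true_w, epsilon=1.0, rng=random.Random(7))
    print(shortest_path(released, "a", "c"))
```

The released path is always a valid path in the public topology; its true length can exceed the true shortest path by an amount that grows with the noise accumulated along the edges involved.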
Randomness is a vital resource for modern-day information processing, especially for cryptography. A wide range of applications critically rely on abundant, high-quality random numbers generated securely. Here, we show how to expand a random seed at an exponential rate without trusting the underlying quantum devices. Our approach is secure against the most general adversaries, and has the following new features: cryptographic level of security, tolerating a constant level of imprecision in devices, requiring only unit size quantum memory (for each device component) in an honest implementation, and allowing a large natural class of constructions for the protocol. In conjunction with a recent work by Chung et al. [2014], it also leads to robust unbounded expansion using just 2 multipart devices. When adapted for distributing cryptographic keys, our method achieves, for the first time, exponential expansion combined with cryptographic security and noise tolerance. The proof proceeds by showing that the Rényi divergence of the outputs of the protocol (for a specific bounding operator) decreases linearly as the protocol iterates. At the heart of the proof are a new uncertainty principle on quantum measurements and a method for simulating trusted measurements with untrusted devices.
Tracking and maintaining satisfactory QoE for video streaming services is becoming a greater challenge for mobile network operators than ever before. Downloading and watching video content on mobile devices is a growing trend among users, one that is driving demand for higher bandwidth and better provisioning throughout the network infrastructure. At the same time, popular demand for privacy has led many online streaming services to adopt end-to-end encryption, leaving providers with only a handful of indicators for identifying QoE issues. In order to address these challenges, we propose a novel methodology for detecting video streaming QoE issues from encrypted traffic. We develop predictive models for detecting different levels of QoE degradation caused by three key influence factors, i.e., stalling, the average video quality, and the quality variations. The models are then evaluated on the production network of a large-scale mobile operator, where we show that despite encryption our methodology is able to detect QoE problems with 72%-92% accuracy, while even higher performance is achieved when dealing with cleartext traffic.
Parallel discrete-event simulation (PDES) is an important tool in the codesign of extreme-scale systems because PDES provides a cost-effective way to evaluate designs of high-performance computing systems. Optimistic synchronization algorithms for PDES, such as Time Warp, allow events to be processed without global synchronization among the processing elements. A rollback mechanism is provided when events are processed out of timestamp order. Although optimistic synchronization protocols enable the scalability of large-scale PDES, the performance of the simulations must be tuned to reduce the number of rollbacks and provide an improved simulation runtime. To enable efficient large-scale optimistic simulations, one has to gain insight into the factors that affect the rollback behavior and simulation performance. We developed a tool for ROSS model developers that gives them detailed metrics on the performance of their large-scale optimistic simulations at varying levels of simulation granularity. Model developers can use this information for parameter tuning of optimistic simulations in order to achieve better runtime and fewer rollbacks. In this work, we instrument the ROSS optimistic PDES framework to gather detailed statistics about the simulation engine. We have also developed an interactive visualization interface that uses the data collected by the ROSS instrumentation to understand the underlying behavior of the simulation engine. The interface connects real time to virtual time in the simulation and provides the ability to view simulation data at different granularities. We demonstrate the usefulness of our framework by performing a visual analysis of the dragonfly network topology model provided by the CODES simulation framework built on top of ROSS. The instrumentation needs to minimize overhead in order to accurately collect data about the simulation performance. 
To ensure that the instrumentation does not introduce unnecessary overhead, we perform a scaling study that compares instrumented ROSS simulations with their noninstrumented counterparts in order to determine the amount of perturbation when running at different simulation scales.
In the capacitated k-median (CKM) problem, we are given a set F of facilities, each facility i ∈ F with a capacity ui, a set C of clients, a metric d over F ∪ C and an integer k. The goal is to open k facilities in F and connect the clients C to the open facilities such that each facility i is connected by at most ui clients, so as to minimize the total connection cost. In this paper, we give the first constant approximation for CKM, that only violates the cardinality constraint by a factor of 1 + ε. This generalizes the result of [Li15], which only works for the uniform capacitated case. Moreover, the approximation ratio we obtain is O([EQUATION] log [EQUATION]), which is an exponential improvement over the ratio of exp (O([EQUATION])) in [Li15]. The natural LP relaxation for the problem, which almost all previous algorithms for CKM are based on, has unbounded integrality gap even if (2 – ε)k facilities can be opened. We introduce a novel configuration LP for the problem, that overcomes this integrality gap. On the downside, each facility may be opened twice by our algorithm.
This article derives trade-offs between three basic costs of a parallel algorithm: synchronization, data movement, and computational cost. These trade-offs are lower bounds on the execution time of the algorithm that are independent of the number of processors but dependent on the problem size. Therefore, they provide lower bounds on the execution time of any parallel schedule of an algorithm computed by a system composed of any number of homogeneous processors, each with associated computational, communication, and synchronization costs. We employ a theoretical model that measures the amount of work and data movement as a maximum over that incurred along any execution path during the parallel computation. By considering this metric rather than the total communication volume over the whole machine, we obtain new insights into the characteristics of parallel schedules for algorithms with nontrivial dependency structures. We also present reductions from BSP and LogGP algorithms to our execution model, extending our lower bounds to these two models of parallel computation. We first develop our results for general dependency graphs and hypergraphs based on their expansion properties, and then we apply the theorem to a number of specific algorithms in numerical linear algebra, namely triangular substitution, Cholesky factorization, and stencil computations. We represent some of these algorithms as families of dependency graphs. We derive their communication lower bounds by studying the communication requirements of the hypergraph structures shared by these dependency graphs. In addition to these lower bounds, we introduce a new communication-efficient parallelization for stencil computation algorithms, which is motivated by results of our lower bound analysis and the properties of previously existing parallelizations of the algorithms.
The ever-growing performance of supercomputers brings demanding requirements for energy efficiency and resilience, due to the rapidly expanding size and duration of use of large-scale computing systems. Many application- and architecture-dependent parameters that individually determine energy efficiency and resilience have causal effects on each other, which directly affect the trade-offs among performance, energy efficiency, and resilience at scale. To enable high-efficiency management of large-scale High-Performance Computing (HPC) systems, a quantitative understanding of the entangled effects among performance, energy efficiency, and resilience is thus required. While previous work focuses on exploring energy-saving and resilience-enhancing opportunities separately, little has been done to theoretically and empirically investigate the interplay between energy efficiency and resilience at scale. In this article, by extending Amdahl's Law and the Karp-Flatt Metric to take resilience into consideration, we quantitatively model the integrated energy efficiency in terms of performance per Watt and showcase the trade-offs among typical HPC parameters, such as number of cores, frequency/voltage, and failure rates. Experimental results for a wide spectrum of HPC benchmarks on two HPC systems show that the proposed models are accurate in extrapolating resilience-aware performance and energy efficiency, and capable of capturing the interplay among various energy-saving and resilience factors. Moreover, the models can help find the optimal HPC configuration for the highest integrated energy efficiency, in the presence of failures and applied resilience techniques.
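For readers unfamiliar with the two classical formulas named above, the following sketch computes the textbook Amdahl's Law speedup and the Karp-Flatt experimentally determined serial fraction. It covers only the standard, failure-free forms; the resilience-aware extensions proposed in the article are not reproduced here, and the function names are our own.

```python
def amdahl_speedup(parallel_fraction, n_cores):
    """Classical Amdahl's Law: speedup on n_cores when a fraction
    `parallel_fraction` of the work parallelizes perfectly."""
    serial = 1.0 - parallel_fraction
    return 1.0 / (serial + parallel_fraction / n_cores)


def karp_flatt(speedup, n_cores):
    """Karp-Flatt metric: serial fraction inferred from a measured speedup.

    Requires n_cores > 1, since the metric is undefined for a single core.
    """
    if n_cores <= 1:
        raise ValueError("Karp-Flatt metric requires more than one core")
    return (1.0 / speedup - 1.0 / n_cores) / (1.0 - 1.0 / n_cores)


if __name__ == "__main__":
    s = amdahl_speedup(0.9, 10)
    print(f"speedup on 10 cores: {s:.3f}")
    print(f"inferred serial fraction: {karp_flatt(s, 10):.3f}")
```

Feeding an ideal Amdahl speedup back into the Karp-Flatt metric recovers the serial fraction that produced it; with measured speedups, a serial fraction that grows with the core count is usually read as a sign of parallel overhead.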
The optimal design of a fault-tolerant quantum computer involves finding an appropriate balance between the burden of large-scale integration of noisy components and the load of improving the reliability of hardware technology. This balance can be evaluated by quantitatively modeling the execution of quantum logic operations on realistic quantum hardware containing limited computational resources. In this work, we report a complete performance simulation software tool capable of (1) searching the hardware design space by varying resource architecture and technology parameters, (2) synthesizing and scheduling a fault-tolerant quantum algorithm within the hardware constraints, (3) quantifying performance metrics such as the execution time and the failure probability of the algorithm, and (4) analyzing the breakdown of these metrics to highlight the performance bottlenecks and visualizing resource utilization to evaluate the adequacy of the chosen design. Using this tool, we investigate a vast design space for implementing key building blocks of Shor’s algorithm to factor a 1,024-bit number with a baseline budget of 1.5 million qubits. We show that a trapped-ion quantum computer designed with twice as many qubits and one-tenth of the baseline infidelity of the communication channel can factor a 2,048-bit integer in less than 5 months.
This work studies applications and generalizations of a simple estimation technique that provides exponential concentration under heavy-tailed distributions, assuming only bounded low-order moments. We show that the technique can be used for approximate minimization of smooth and strongly convex losses, and specifically for least squares linear regression. For instance, our d-dimensional estimator requires just O(d log(1/δ)) random samples to obtain a constant factor approximation to the optimal least squares loss with probability 1-δ, without requiring the covariates or noise to be bounded or subgaussian. We provide further applications to sparse linear regression and low-rank covariance matrix estimation with similar allowances on the noise and covariate distributions. The core technique is a generalization of the median-of-means estimator to arbitrary metric spaces.
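The basic scalar median-of-means estimator that this line of work generalizes can be sketched as follows: split the samples into k groups, average each group, and take the median of the group means. This is a minimal sketch of the classical one-dimensional estimator only, not the metric-space generalization described in the abstract; the function name and parameters are our own, and k is typically chosen on the order of log(1/δ).

```python
import random
import statistics


def median_of_means(samples, k, rng=None):
    """Scalar median-of-means with k groups (hypothetical helper).

    Samples are shuffled and dealt round-robin into k groups of nearly
    equal size; the estimate is the median of the k group means.
    """
    data = list(samples)
    if not 1 <= k <= len(data):
        raise ValueError("k must be between 1 and the number of samples")
    (rng or random).shuffle(data)
    groups = [data[i::k] for i in range(k)]
    return statistics.median([statistics.fmean(g) for g in groups])


if __name__ == "__main__":
    # 99 well-behaved samples plus one extreme outlier.
    data = [1.0] * 99 + [1e9]
    print("plain mean:     ", statistics.fmean(data))
    print("median of means:", median_of_means(data, k=9, rng=random.Random(0)))
```

A single extreme sample can corrupt at most one group mean, so the median over groups is unaffected, whereas the plain empirical mean is dominated by the outlier.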
An exploratory experiment found that sorting arrays of random integers using Java 8's parallel sort required only 50%-70% of the time taken using the parallel sort of the Parallel Colt library. Factors considered responsible for the performance advantage include the use of a dual-pivot quicksort on locally held data at certain phases of execution and work-stealing by threads, a feature of the fork-join framework. The default performance of Parallel Colt's parallel sort was found to degrade dramatically for small array sizes due to unnecessary thread creation.
Virtual reality allows users to experience unusual immersive environments. There are still several aspects of design for virtual reality that need more investigation, such as transitioning between environments. Multiple studies have shown that physical movement in a virtual environment supports immersion and presence. Our setup will allow a comparative study of the coupling of virtual camera movements with simultaneous physical movements of the user in terms of user preference and comfort. This work-in-progress uses a within-subject experimental design for evaluating interaction prototypes based on the Oculus Rift DK2, in which participants will be tasked with transitioning between different environments: once using physical motion merely to trigger the transition, and once with the virtual camera movement coupled to the physical motion. Qualitative and quantitative data will be collected using questionnaires and in-game metrics. Pretests of a similar setup were used to establish minimal levels of comfort.
Beginning the analysis of new data is often difficult as modern datasets can be overwhelmingly large. With visual analytics in particular, displays of large datasets quickly become crowded and unclear. Through observing the practices of analysts working with the event sequence visualization tool EventFlow, we identified three techniques to reduce initial visual complexity by reducing the number of event categories resulting in a simplified overview. For novice users, we suggest an initial pair of event categories to display. For advanced users, we provide six ranking metrics and display all pairs in a ranked list. Finally, we present the Event Category Matrix (ECM), which simultaneously displays overviews of every event category pair. In this work, we report on the development of these techniques through two formative usability studies and the improvements made as a result. The goal of our work is to investigate strategies that help users overcome the challenges associated with initial visual complexity and to motivate the use of simplified overviews in temporal event sequence analysis.
There are seemingly many advantages to being able to identify, document, test, and trace single or "atomic" requirements. Why then has there been little attention to the topic and no widely used definition or process on how to define atomic requirements? Definitions of requirements and standards focus on user needs, system capabilities or functions; some definitions include making individual requirements singular or without the use of conjunctions. In a few cases there has been a description of atomic system events or requirements. This work is surveyed here although there is no well accepted and used best practice for generating atomic requirements. Due to their importance in software engineering, quality and metrics for requirements have received considerable attention. In the seminal paper on software requirements quality, Davis et al. proposed specific metrics including the "unambiguous quality factor" and the "verifiable quality factor"; these and other metrics work best with a clearly enumerable list of single requirements. Atomic requirements are defined here as a natural language statement that completely describes a single system function, feature, need, or capability, including all information, details, limits, and characteristics. A typical user login screen is used as an example of an atomic requirement which can include both functional and nonfunctional requirements. Individual atomic requirements are supported by a system glossary, references to applicable industry standards, mock ups of the user interface, etc. One way to identify such atomic requirements is from use case or system event analysis. This definition of atomic requirements is still a work in progress and offered to prompt discussion. Atomic requirements allow clear naming or numbering of requirements for traceability, change management, and importance ranking. 
Further, atomic requirements defined in this manner are suitable for rapid implementation approaches (implementing one requirement at a time), enable good test planning (testing can clearly indicate pass or fail of the whole requirement), and offer other management advantages in project control.
Cloud computing provides a shared pool of resources for large-scale distributed applications. Recent trends such as fog computing and edge computing move the workload of clouds closer to the edge of the network and the users. Exploiting the edge resources efficiently requires managing the resources and directing user traffic to the correct edge servers. In this paper we propose to profile and group users according to their interest profiles. We consider edge caching as an example and, through our evaluation, show the potential benefits of directing users from the same group to the same caches. We investigate a range of workloads and parameters and find that the same conclusions apply. Our results highlight the importance of grouping users and demonstrate the potential benefits of this approach.
Vehicular users are expected to consume large amounts of data, for both entertainment and navigation purposes. This will put a strain on cellular networks, which will be able to cope with such a load only if proper caching is in place; this in turn begs the question of which caching architecture is the best-suited to deal with vehicular content consumption. In this paper, we leverage a large-scale, crowd-sourced trace to (i) characterize the vehicular traffic demand, in terms of overall magnitude and content breakup; (ii) assess how different caching approaches perform against such a real-world load; (iii) study the effect of recommendation systems and local content items. We define a price-of-fog metric, expressing the additional caching capacity to deploy when moving from traditional, centralized caching architectures to a "fog computing" approach, where caches are closer to the network edge. We find that for location-specific items, such as the ones that vehicular users are most likely to request, such a price almost disappears. Vehicular networks thus make a strong case for the adoption of mobile-edge caching, as we are able to reap the benefit thereof – including a reduction in the distance travelled by data, within the core network – with little or none of the associated disadvantages.
The exploitation of opportunistic infrastructure via Device-to-Device (D2D) communication is a critical component in the adoption of new paradigms such as edge and fog computing. While a lot of work has demonstrated the great potential of D2D communication, it is still unclear whether the benefits of the D2D approach can really be leveraged in practice. In this paper, we develop a software sensor, namely Detector, which senses the infrastructure in proximity of a mobile user. We analyze and evaluate D2D in the wild, i.e., not in simulations. We found that in a realistic environment, a mobile device is always in proximity to at least one other mobile device throughout the day. This suggests that a device can schedule task processing in coordination with other, potentially more powerful, devices instead of handling the processing of the tasks by itself.
In the Internet of Things (IoT), Internet-connected things provide an influx of data and resources that offer vast possibilities for applications and services. Smart City IoT systems refer to things that are distributed over wide physical areas covering a whole city. While this new breed of data and resources looks promising, building applications in such large-scale IoT systems is a difficult task due to the distributed and dynamic nature of the entities involved, such as sensing and actuating devices, people, and computing resources. In this paper, we explore the process of developing Smart City IoT applications from a coordination-based perspective. We show that a distributed coordination model that oversees such a large group of distributed components is necessary in building Smart City IoT applications. In particular, we propose Adaptive Distributed Dataflow, a novel Dataflow-based programming model that focuses on coordinating city-scale distributed systems that are highly heterogeneous and dynamic.