Biblio
The widespread availability of large graph data and the increasing popularity of storing data in the cloud drive the need for graph query processing on a remote cloud server. A fundamental challenge is to process user queries without compromising sensitive information. This work focuses on privacy-preserving subgraph matching on a cloud server. The goal is to minimize the overhead on both the cloud and client sides for subgraph matching, without compromising users' sensitive information. To that end, we transform an original graph G into a privacy-preserving graph Gk, which meets the requirement of an existing privacy model known as k-automorphism. By making use of the symmetry in a k-automorphic graph, a subgraph matching query can be efficiently answered using a graph Go, a small subset of Gk. This approach saves both space and query cost on the cloud server. We also anonymize the query graphs to protect their label information using a label generalization technique. To reduce the search space for a subgraph matching query, we propose a cost model to select the more effective label combinations. The effectiveness and efficiency of our method are demonstrated through extensive experimental results on real datasets.
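To make the block-and-expand idea concrete, here is a minimal, hypothetical sketch (using networkx; the function name, the plain-dict automorphisms, and the restriction to matches that stay inside one block are assumptions of this sketch, not the paper's algorithm): matches are found inside the small block Go and then replicated through the k automorphic mappings of Gk.

```python
# Hypothetical sketch: answer a subgraph query on one block Go of a
# k-automorphic graph Gk and expand matches via the automorphic mappings.
import networkx as nx
from networkx.algorithms import isomorphism


def answer_query(G_o, query, automorphisms):
    """G_o: one block of Gk (nx.Graph with 'label' node attributes);
    query: nx.Graph with 'label' node attributes;
    automorphisms: list of dicts mapping Gk vertices to Gk vertices."""
    matcher = isomorphism.GraphMatcher(
        G_o, query,
        node_match=lambda u, v: u.get("label") == v.get("label"))
    results = set()
    for mapping in matcher.subgraph_isomorphisms_iter():
        # mapping: G_o vertex -> query vertex; record as (query vertex, data vertex)
        results.add(tuple(sorted((q, d) for d, q in mapping.items())))
        # replicate the match in the rest of Gk through each automorphism
        for f in automorphisms:
            results.add(tuple(sorted((q, f[d]) for d, q in mapping.items())))
    return results
```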
Generic multi-button controllers are the most common input devices used for video games. In contrast, dedicated game controllers and gestural interactions increase immersion and playability. Room-sized gaming has opened up possibilities to further enhance the immersive experience, and provides players with opportunities to use full-body movements as input. We present a purpose-centric approach to appropriating everyday objects as physical game controllers for immersive room-sized gaming. Virtual manipulations supported by such physical controllers mimic real-world function and usage. Doing so opens up new possibilities for interactions that flow seamlessly from the physical into the virtual world. As a proof of concept, we present a 'Tower Defence'-style game that uses four everyday household objects as game controllers, each of which serves as a weapon to defend the players' base from enemy bots. Players can use 1) a mop (or a broom) to sweep away enemy bots directionally; 2) a fan to scatter them away; 3) a vacuum cleaner to suck them up; and 4) a mouse trap to destroy them. Each controller is tracked using a motion capture system. A physics engine integrated into the game ensures that virtual objects act as though they are manipulated by the actual physical controller, thus providing players with a highly immersive gaming experience.
Virtual Reality and immersive experiences, which allow players to share the same virtual environment as the characters of a virtual world, have gained more and more interest recently. In conceiving these immersive virtual worlds, one of the challenges is to give the characters that populate them the ability to express behaviors that support the immersion. In this work, we propose a model capable of controlling and simulating a conversational group of social agents in an immersive environment. We describe this model, which had previously been validated using a regular screen setting, and we present a study measuring whether users recognized the attitudes expressed by virtual agents through real-time generated animations of nonverbal behavior in an immersive setting. Results mirrored those of the regular screen setting, thus providing further insights for improving players' experiences by integrating them into immersive simulated group conversations with characters that express different interpersonal attitudes.
In this paper we present work-in-progress toward a vision of personalized views of visual analytics interfaces in the context of collaborative analytics in immersive spaces. In particular, we are interested in the sense of immersion, responsiveness, and personalization afforded by gaze-based input. Through combining large screen visual analytics tools with eye-tracking, a collaborative visual analytics system can become egocentric while not disrupting the collaborative nature of the experience. We present a prototype system and several ideas for real-time personalization of views in visual analytics.
Many of the game-changing innovations the Internet brought and continues to bring to our daily professional and private lives come with privacy-related costs. The more day-to-day activities are based on the Internet, the more personal data are generated, collected, stored, and used. Big Data, the Internet of Things, cyber-physical systems, and similar trends will be based on even more of the personal information all of us use and produce constantly. Three major points are to be noted here: First, there is no common European or even worldwide agreement on whether and to what extent these collections need to be limited. There is, however, no common privacy law, neither in Europe nor worldwide. Second, the laws that do exist constantly fail to steer these developments. Technology innovations come so fast, are so disruptive, and are so market-demand driven that ex-post control by law and courts constantly comes late and/or is circumvented and/or ignored. Third, the lack of consensus and lack of steering lead to huge data accumulations and market monopolies built up very quickly and held by very few companies working at a global level with data-driven business models. These early movers are in many cases in very dominant market positions, making it not only more difficult to regulate their behavior but also to keep the markets open for future competitors. This workshop will evaluate current European and international attempts to deal with this situation. Although all four panelists have a legal background, the meeting will be less interested in an in-depth review of existing laws and their impact, and more in the underlying technological and ethical principles (and their inconsistencies) leading to the situation described. Specific attention will be given to technology-driven attempts to deal with the situation, such as privacy by design, privacy by default, and usable privacy.
Software Defined Networking (SDN) is a promising technological advancement in the networking world. It is still evolving, and security is a major concern for SDN. In this paper we propose a policy-based security architecture for securing SDN domains. Our architecture enables the administrator to enforce different types of policies, such as those based on devices, users, location, and path, for securing communication in an SDN domain. Our architecture is developed as an application that can run on any SDN controller. We have implemented our architecture using the POX controller and Raspberry Pi 2 switches. We present different case scenarios to demonstrate fine-granular security policy enforcement with our architecture.
Emergency evacuations during disasters minimize loss of lives and injuries. It is not surprising that emergency evacuation preparedness is mandatory for organizations in many jurisdictions. In the case of corporations, this requirement translates to considerable expense, consisting of construction costs, equipment, recruitment, retention, and training. In addition, required regular evacuation drills cause recurring expenses and loss of productivity. Any automation to assist in these drills and in actual evacuations can mean savings of costs, time, and lives. Evacuation assistance systems rely on attendance systems that often fall short in accuracy, particularly in environments with a lot of "non-swipers" (customers, visitors, etc.). A critical question to answer in the case of an emergency is "How many people are still in the building?". This number is calculated by comparing the number of people gathered at the assembly point to the last known number of people inside the building. An IoT-based system can enhance the answer to that question by providing the number of people in the building and their last known locations in an automated fashion, and even by automating the reconciliation process. Our proposed system detects people in the building automatically using multiple channels such as WiFi and motion detection. Such a system needs the ability to link specific identifiers to persons reliably. In this paper we present our statistics- and heuristics-based solutions for linking detected identifiers to actual persons in a privacy-preserving manner using IoT technologies.
Dynamic Searchable Symmetric Encryption (DSSE) allows a client to perform keyword searches over encrypted files via an encrypted data structure. Despite its merits, DSSE leaks search and update patterns when the client accesses the encrypted data structure. These leakages may create severe privacy problems as already shown, for example, in recent statistical attacks on DSSE. While Oblivious Random Access Memory (ORAM) can hide such access patterns, it incurs significant communication overhead and, therefore, is not yet fully practical for cloud computing systems. Hence, there is a critical need to develop private access schemes over the encrypted data structure that can seal the leakages of DSSE while achieving practical search/update operations. In this paper, we propose a new oblivious access scheme over the encrypted data structure for searchable encryption purposes, which we call Distributed Oblivious Data structure DSSE (DOD-DSSE). The main idea is to create a distributed encrypted incidence matrix on two non-colluding servers such that no arbitrary queries on these servers can be linked to each other. This strategy prevents not only recent statistical attacks on the encrypted data structure but also other potential threats exploiting query linkability. Our security analysis proves that DOD-DSSE ensures the unlinkability of queries and, therefore, offers much higher security than traditional DSSE. At the same time, our performance evaluation demonstrates that DOD-DSSE is two orders of magnitude faster than ORAM-based techniques (e.g., Path ORAM), since it incurs only a small constant communication overhead. We deployed DOD-DSSE on geographically distributed Amazon EC2 servers and showed that a search/update operation on a very large dataset takes only around one second with DOD-DSSE, while it takes 3 to 13 minutes with Path ORAM-based methods.
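The following is a minimal, hypothetical sketch of the two-server access idea only (not the DOD-DSSE construction itself): each keyword's encrypted row is fetched from one server, re-encrypted under a fresh key, and re-stored on the other server at a fresh address, so neither server on its own can link two accesses to the same keyword. The class and function names, and the use of a one-time XOR mask in place of real encryption, are assumptions of this sketch.

```python
# Toy two-server "fetch, re-encrypt, move" access pattern.
import os
import secrets


def xor_mask(row: bytes, key: bytes) -> bytes:
    """One-time-pad-style masking; stands in for real encryption here."""
    return bytes(a ^ b for a, b in zip(row, key))


class Client:
    def __init__(self):
        self.servers = [dict(), dict()]   # stand-ins for two non-colluding servers
        self.state = {}                   # keyword -> (server, address, key)

    def _store(self, keyword: str, row: bytes, server: int) -> None:
        addr, key = secrets.token_hex(8), os.urandom(len(row))
        self.servers[server][addr] = xor_mask(row, key)
        self.state[keyword] = (server, addr, key)

    def add(self, keyword: str, row: bytes) -> None:
        self._store(keyword, row, secrets.randbelow(2))

    def search(self, keyword: str) -> bytes:
        server, addr, key = self.state[keyword]
        row = xor_mask(self.servers[server].pop(addr), key)
        self._store(keyword, row, 1 - server)   # move to the other server
        return row
```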
Malware evolves perpetually and relies on increasingly sophisticated attacks to supersede defense strategies. Data-driven approaches to malware detection run the risk of becoming rapidly antiquated. Keeping pace with malware requires models that are periodically enriched with fresh knowledge, commonly known as retraining. In this work, we propose the use of Venn-Abers predictors for assessing the quality of binary classification tasks as a first step towards identifying antiquated models. One of the key benefits behind the use of Venn-Abers predictors is that they are automatically well calibrated and offer probabilistic guidance on the identification of nonstationary populations of malware. Our framework is agnostic to the underlying classification algorithm and can then be used for building better retraining strategies in the presence of concept drift. Results obtained over a timeline-based evaluation with about 90K samples show that our framework can identify when models tend to become obsolete.
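As an illustration of the calibration idea, a sketch of the standard inductive Venn-Abers procedure (not the paper's framework; function names are ours) fits isotonic regression twice on a held-out calibration set, once with the test point labelled 0 and once labelled 1, yielding a calibrated interval [p0, p1] and a single merged probability:

```python
# Hedged sketch of an inductive Venn-Abers predictor for a binary classifier.
import numpy as np
from sklearn.isotonic import IsotonicRegression


def venn_abers_interval(cal_scores, cal_labels, test_score):
    """cal_scores, cal_labels: classifier scores and 0/1 labels of a
    calibration set; test_score: score of the point to calibrate."""
    scores = np.append(np.asarray(cal_scores, dtype=float), test_score)
    interval = []
    for hypothetical_label in (0, 1):
        labels = np.append(np.asarray(cal_labels, dtype=float), hypothetical_label)
        iso = IsotonicRegression(y_min=0.0, y_max=1.0, out_of_bounds="clip")
        iso.fit(scores, labels)
        interval.append(float(iso.predict([test_score])[0]))
    p0, p1 = interval
    # standard log-loss-minimizing merge of the interval into one probability
    return p0, p1, p1 / (1.0 - p0 + p1)
```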
Among the signature schemes most widely deployed in practice are the DSA (Digital Signature Algorithm) and its elliptic curves variant ECDSA. They are represented in many international standards, including IEEE P1363, ANSI X9.62, and FIPS 186-4. Their popularity stands in stark contrast to the absence of rigorous security analyses: Previous works either study modified versions of (EC)DSA or provide a security analysis of unmodified ECDSA in the generic group model. Unfortunately, works following the latter approach assume abstractions of non-algebraic functions over generic groups for which it remains unclear how they translate to the security of ECDSA in practice. For instance, it has been pointed out that prior results in the generic group model actually establish strong unforgeability of ECDSA, a property that the scheme de facto does not possess. As, further, no formal results are known for DSA, understanding the security of both schemes remains an open problem. In this work we propose GenericDSA, a signature framework that subsumes both DSA and ECDSA in unmodified form. It carefully models the "modulo q" conversion function of (EC)DSA as a composition of three independent functions. The two outer functions mimic algebraic properties in the function's domain and range, the inner one is modeled as a bijective random oracle. We rigorously prove results on the security of GenericDSA that indicate that forging signatures in (EC)DSA is as hard as solving discrete logarithms. Importantly, our proofs do not assume generic group behavior.
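For reference, the (EC)DSA signing equations that the framework abstracts are the standard ones below; the decomposition of the conversion function f into three parts follows the abstract, while the symbol names (phi, Pi, psi) are chosen here for illustration and are not taken from the paper.

```latex
% Standard (EC)DSA signing of message m with secret key x, per-signature
% nonce k, group generator g and group order q; f is the conversion function
% (for ECDSA, the x-coordinate of the curve point, reduced mod q).
\begin{aligned}
  r &= f\bigl(g^{k}\bigr), \qquad
  s  = k^{-1}\bigl(H(m) + x\,r\bigr) \bmod q,\\
  f &= \psi \circ \Pi \circ \varphi
  \quad\text{with } \varphi,\psi \text{ mimicking algebraic structure and }
  \Pi \text{ a bijective random oracle.}
\end{aligned}
```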
Cryptographic hash functions are used to protect the integrity of information. Hash functions are often designed by using existing block ciphers as compression functions, owing to the challenges and difficulties encountered in constructing new hash functions from scratch. However, the key generation required by the encryption process results in a huge computational cost, which affects the efficiency of the hash function. This paper proposes a new, secure, and efficient compression function based on a pseudorandom function that takes two n-bit inputs and produces one n-bit output (2n-to-n bit). In addition, a new keyed hash function with three variants (PinTar 128, 256, and 512 bits) is proposed, which uses the proposed compression function as its underlying building block. Statistical analysis shows that the compression function is an efficient one-way random function. Similarly, statistical analysis of the keyed hash function shows that it has a strong avalanche property and is resistant to exhaustive key search attacks. The proposed keyed hash function can be used as a candidate for developing security systems.
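As a toy illustration of the 2n-to-n-bit idea (not the paper's construction: HMAC-SHA-256 stands in for the pseudorandom function, and the padding and iteration details are assumptions of this sketch):

```python
# Illustrative PRF-based compression function and a simple iterated keyed hash.
import hashlib
import hmac

N_BYTES = 32   # n = 256 bits


def compress(chaining: bytes, block: bytes) -> bytes:
    """Map a 2n-bit input (chaining value || message block) to n bits."""
    assert len(chaining) == N_BYTES and len(block) == N_BYTES
    return hmac.new(chaining, block, hashlib.sha256).digest()


def keyed_hash(key: bytes, message: bytes) -> bytes:
    """Merkle-Damgaard-style iteration of the compression function."""
    state = hashlib.sha256(key).digest()              # key-derived initial value
    padded = message + b"\x80" + b"\x00" * (-(len(message) + 1) % N_BYTES)
    for i in range(0, len(padded), N_BYTES):
        state = compress(state, padded[i:i + N_BYTES])
    return state
```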
We propose a pseudo-random bit generator (PRBG) constructed from an ergodic orbit of the logistic map, generated using intermittent perturbation and sampled using the Bernoulli shift map. The pseudo-randomness of the key streams is tested with the standard NIST statistical test suite, and the key streams generated by the proposed PRBG pass all the tests of the suite. As an application of the PRBG we propose chaotic stream ciphers and conduct goodness-of-fit tests for them. Stream ciphers built on the PRBG are shown to produce ciphertexts in which the distribution of characters correlates closely with the uniform distribution. We also show that the ciphers exhibit a good avalanche effect when the initial seed is varied infinitesimally, and that the avalanche effect can be clearly observed at different points within the ciphertext as it is being generated. Furthermore, the proposed PRBG is shown to be on par with the conventional PRBG RC4. This vindicates our hypothesis that the stream of bits obtained from orbits of the logistic map, using perturbation and the Bernoulli shift map, exhibits randomness.
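A small illustrative sketch of the described pipeline follows; the parameter choices (r, the perturbation period and size, and the thresholding used to read bits off the Bernoulli shift) are assumptions of this sketch rather than the paper's.

```python
# Toy PRBG: iterate the logistic map, perturb the orbit intermittently,
# and extract one bit per step via the Bernoulli (doubling) shift map.
def logistic_bernoulli_prbg(seed: float, n_bits: int, r: float = 4.0,
                            perturb_every: int = 64, eps: float = 1e-9):
    x, bits = seed, []
    for i in range(n_bits):
        x = r * x * (1.0 - x)                     # logistic map step
        if i % perturb_every == perturb_every - 1:
            x = (x + eps) % 1.0                   # intermittent perturbation
        bits.append(1 if 2.0 * x >= 1.0 else 0)   # Bernoulli shift symbol
    return bits


# Example: a 128-bit key stream from an arbitrary seed in (0, 1).
key_stream = logistic_bernoulli_prbg(seed=0.3141592653, n_bits=128)
```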
With the help of rapidly developing technology, DNA sequencing is becoming less expensive. As a consequence, research in genomics has gained speed in paving the way to personalized (genomic) medicine, and geneticists need large collections of human genomes to further increase this speed. Furthermore, individuals are using their genomes to learn about their (genetic) predispositions to diseases, their ancestries, and even their (genetic) compatibilities with potential partners. This trend has also led to the launch of health-related websites and online social networks (OSNs) in which individuals share their genomic data (e.g., OpenSNP or 23andMe). On the other hand, genomic data carries a great deal of sensitive information about its owner. By analyzing the DNA of an individual, it is now possible to learn about his or her disease predispositions (e.g., for Alzheimer's or Parkinson's), ancestries, and physical attributes. The threat to genomic privacy is magnified by the fact that a person's genome is correlated with his or her family members' genomes, leading to interdependent privacy risks. This short tutorial will help computer scientists better understand the privacy and security challenges in today's genomic era. We will first highlight the significance of genomic data and the threats to genomic privacy. Then, we will present high-level descriptions of the proposed solutions to protect the privacy of genomic data and discuss future research directions. No prerequisite knowledge of biology or genomics is required of attendees; we only require a basic background in cryptography and statistics.
Path prediction on the Internet has been a topic of research in the networking community for close to a decade. Applications of path prediction solutions have ranged from optimizing selection of peers in peer-to-peer networks to improving and debugging CDN predictions. Recently, revelations of traffic correlation and surveillance on the Internet have raised the topic of path prediction in the context of network security. Specifically, predicting network paths can allow us to identify and avoid given organizations on network paths (e.g., to avoid traffic correlation attacks in Tor) or to infer the impact of hijacks and interceptions when direct measurements are not available. In this poster we propose the design and implementation of PathCache, which aims to reuse measurement data to estimate AS-level paths on the Internet. Unlike similar systems, PathCache does not assume that routing on the Internet is destination based. Instead, we develop an algorithm to compute confidence in paths between ASes. These multiple paths, ranked by their confidence values, are returned to the user.
Pseudo-random number generators (PRNGs) are a critical infrastructure for cryptography and security of many computer applications. At the same time, PRNGs are surprisingly difficult to design, implement, and debug. This paper presents the first static analysis technique specifically for quality assurance of cryptographic PRNG implementations. The analysis targets a particular kind of implementation defect, the entropy loss. Entropy loss occurs when the entropy contained in the PRNG seed is not utilized to the full extent for generating the pseudo-random output stream. The Debian OpenSSL disaster, probably the most prominent PRNG-related security incident, was one but not the only manifestation of such a defect. Together with the static analysis technique, we present its implementation, a tool named Entroposcope. The tool offers a high degree of automation and practicality. We have applied the tool to five real-world PRNGs of different designs and show that it effectively detects both known and previously unknown instances of entropy loss.
We propose a game theoretic framework for task allocation in mobile cloud computing that corresponds to offloading of compute tasks to a group of nearby mobile devices. Specifically, in our framework, a distributor node holds a multidimensional auction for allocating the tasks of a job among nearby mobile nodes based on their computational capabilities and also the cost of computation at these nodes, with the goal of reducing the overall job completion time. Our proposed auction also has the desired incentive compatibility property that ensures that mobile devices truthfully reveal their capabilities and costs and that those devices benefit from the task allocation. To deal with node mobility, we perform multiple auctions over adaptive time intervals. We develop a heuristic approach to dynamically find the best time intervals between auctions to minimize unnecessary auctions and the accompanying overheads. We evaluate our framework and methods using both real world and synthetic mobility traces. Our evaluation results show that our game theoretic framework improves the job completion time by a factor of 2-5 in comparison to the time taken for executing the job locally, while minimizing the number of auctions and the accompanying overheads. Our approach is also profitable for the nearby nodes that execute the distributor's tasks with these nodes receiving a compensation higher than their actual costs.
Graph data publishing under node-differential privacy (node-DP) is challenging due to the huge sensitivity of queries. However, since a node in graph data oftentimes represents a person, node-DP is necessary to achieve personal data protection. In this paper, we investigate the problem of publishing the degree distribution of a graph under node-DP by exploring the projection approach to reduce the sensitivity. We propose two approaches, based on aggregation and on a cumulative histogram, to publish the degree distribution. The experiments demonstrate that our approaches greatly reduce the error of approximating the true degree distribution and significantly improve over existing works. We also present an introspective analysis for understanding the factors of publishing the degree distribution with node-DP.
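One common recipe behind projection-based node-DP degree release looks roughly as follows; the theta-truncation and the 2*theta + 1 sensitivity bound used here are assumptions of this sketch, not necessarily the paper's exact mechanisms.

```python
# Hedged sketch: project the graph to maximum degree theta, then release a
# Laplace-noised degree histogram calibrated to the reduced sensitivity.
import networkx as nx
import numpy as np


def private_degree_histogram(G: nx.Graph, theta: int, epsilon: float):
    # Projection: drop excess edges of nodes whose degree exceeds theta.
    H = G.copy()
    for v in list(H.nodes()):
        extra = H.degree(v) - theta
        if extra > 0:
            H.remove_edges_from(list(H.edges(v))[:extra])
    # Degree histogram over bins 0..theta (max degree is now at most theta).
    hist = np.zeros(theta + 1)
    for _, d in H.degree():
        hist[d] += 1
    # Adding/removing one node can shift its own bin and up to theta
    # neighbours' bins, giving a sensitivity of roughly 2*theta + 1.
    sensitivity = 2 * theta + 1
    noisy = hist + np.random.laplace(scale=sensitivity / epsilon, size=hist.shape)
    return np.clip(noisy, 0, None)
```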
Differential privacy has become the dominant standard in the research community for strong privacy protection. There has been a flood of research into query answering algorithms that meet this standard. Algorithms are becoming increasingly complex, and in particular, the performance of many emerging algorithms is data dependent, meaning the distribution of the noise added to query answers may change depending on the input data. Theoretical analysis typically only considers the worst case, making empirical study of average case performance increasingly important. In this paper we propose a set of evaluation principles which we argue are essential for sound evaluation. Based on these principles we propose DPBench, a novel evaluation framework for standardized evaluation of privacy algorithms. We then apply our benchmark to evaluate algorithms for answering 1- and 2-dimensional range queries. The result is a thorough empirical study of 15 published algorithms on a total of 27 datasets that offers new insights into algorithm behavior, in particular the influence of dataset scale and shape, and a more complete characterization of the state of the art. Our methodology is able to resolve inconsistencies in prior empirical studies and place algorithm performance in context through comparison to simple baselines. Finally, we pose open research questions which we hope will guide future algorithm design.
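As a point of reference for what such a benchmark compares against, a minimal "identity" baseline for 1-D range queries (our illustration, not DPBench code) noises the histogram once with the Laplace mechanism and answers every range query from it; data-dependent algorithms would replace the noising step.

```python
# Laplace "identity" baseline for 1-D range queries and its average error.
import numpy as np


def identity_baseline(hist: np.ndarray, epsilon: float) -> np.ndarray:
    # Histogram counts have sensitivity 1 under add/remove-one-record DP.
    return hist + np.random.laplace(scale=1.0 / epsilon, size=hist.shape)


def avg_range_query_error(true_hist, noisy_hist, queries):
    """queries: iterable of (lo, hi) index pairs defining range queries."""
    errs = [abs(noisy_hist[lo:hi].sum() - true_hist[lo:hi].sum())
            for lo, hi in queries]
    return float(np.mean(errs))
```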
Recent advances in differential privacy make it possible to guarantee user privacy while preserving the main characteristics of the data. However, most differential privacy mechanisms assume that the underlying dataset is clean. This paper explores the link between data cleaning and differential privacy in a framework we call PrivateClean. PrivateClean includes a technique for creating private datasets of numerical and discrete-valued attributes, a formalism for privacy-preserving data cleaning, and techniques for answering sum, count, and avg queries after cleaning. We show: (1) how the degree of privacy affects subsequent aggregate query accuracy, (2) how privacy potentially amplifies certain types of errors in a dataset, and (3) how this analysis can be used to tune the degree of privacy. The key insight is to maintain a bipartite graph relating dirty values to clean values and use this graph to estimate biases due to the interaction between cleaning and privacy. We validate these results on four datasets with a variety of well-studied cleaning techniques including using functional dependencies, outlier filtering, and resolving inconsistent attributes.
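To illustrate only the privatize-then-debias interplay the abstract alludes to, the sketch below uses generalized randomized response with its standard bias correction; this is not the PrivateClean mechanism itself, which maintains a bipartite graph relating dirty values to clean values, and the function names are ours.

```python
# Privatize a discrete attribute and debias a count query over the result.
import math
import random


def grr_privatize(value, domain, epsilon):
    """Report the true value with probability p, else a random other value."""
    p = math.exp(epsilon) / (math.exp(epsilon) + len(domain) - 1)
    if random.random() < p:
        return value
    return random.choice([v for v in domain if v != value])


def corrected_count(private_values, target, domain, epsilon):
    """Unbiased estimate of how often `target` occurs in the true column."""
    k, n = len(domain), len(private_values)
    p = math.exp(epsilon) / (math.exp(epsilon) + k - 1)
    q = (1.0 - p) / (k - 1)
    observed = sum(1 for v in private_values if v == target)
    return (observed - n * q) / (p - q)
```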
Extensible Cyber-Physical Systems (CPS) are loosely connected, multi-domain platforms that "virtualize" their resources to provide an open platform capable of hosting different cyber-physical applications. These cyber-physical platforms are extensible in that resources and applications can be added or removed at any time. However, realizing such a platform requires resolving challenges emanating from different properties; in this paper, we focus on resilience. Resilience is important for extensible CPS to ensure that the extensibility of a system does not result in failures and anomalies.
Here we model the indirect costs of deploying security controls in small-to-medium enterprises (SMEs) to manage cyber threats. SMEs may not have the in-house skills and collective capacity to operate controls efficiently, resulting in inadvertent data leakage and exposure to compromise. Aside from financial costs, attempts to maintain security can impact morale, system performance, and retraining requirements, which are modelled here. Managing the overall complexity and effectiveness of an SME's security controls has the potential to reduce unintended leakage. The UK Cyber Essentials Scheme informs basic control definitions, and Available Responsibility Budget (ARB) is modelled to understand how controls can be prioritised for both security and usability. Human factors of security and practical experience of security management for SMEs inform the modelling of deployment challenges across a set of SME archetypes differing in size, complexity, and use of IT. Simple combinations of controls are matched to archetypes, balancing capabilities to protect data assets with the effort demands placed upon employees. Experiments indicate that two-factor authentication can be readily adopted by many SMEs and their employees to protect core assets, followed by correct access privileges and anti-malware software. Service and technology providers emerge as playing an important role in improving access to usable security controls for SMEs.
Head portraits are popular in traditional painting. Automating portrait painting is challenging, as the human visual system is sensitive to the slightest irregularities in human faces. Applying generic painting techniques often deforms facial structures. On the other hand, existing portrait painting techniques are mainly designed for the graphite style and/or are based on image analogies; an example painting as well as its original unpainted version are required. This limits their domain of applicability. We present a new technique for transferring the painting from one head portrait onto another. Unlike previous work, our technique requires only the example painting and is not restricted to a specific style. We impose novel spatial constraints by locally transferring the color distributions of the example painting. This better captures the painting texture and maintains the integrity of facial structures. We generate a solution through convolutional neural networks, and we present an extension to video, in which motion is exploited to reduce temporal inconsistencies and the shower-door effect. Our approach transfers the painting style while maintaining the identity of the input photograph. In addition, it significantly reduces facial deformations over the state of the art.
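The "transfer the color distribution" ingredient, in its simplest global form, can be sketched as a per-channel mean/variance match; this is only an illustration of the idea, since the paper applies such transfers locally and combines them with CNN-based synthesis and facial constraints.

```python
# Global color-statistics transfer from an example painting to a content photo.
import numpy as np


def match_color_stats(content: np.ndarray, example: np.ndarray) -> np.ndarray:
    """content, example: float arrays of shape (H, W, 3) with values in [0, 1]."""
    out = content.astype(np.float64).copy()
    for c in range(3):
        mu_c, sd_c = content[..., c].mean(), content[..., c].std() + 1e-8
        mu_e, sd_e = example[..., c].mean(), example[..., c].std() + 1e-8
        # shift and scale the channel so its mean/std match the example's
        out[..., c] = (content[..., c] - mu_c) * (sd_e / sd_c) + mu_e
    return np.clip(out, 0.0, 1.0)
```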
In many domains, a plethora of textual information is available on the web as news reports, blog posts, community portals, etc. Information extraction (IE) is the default technique to turn unstructured text into structured fact databases, but systematically applying IE techniques to web input requires highly complex systems, ranging from focused crawlers over quality assurance methods to cope with the HTML input to long pipelines of natural language processing and IE algorithms. Although a number of tools for each of these steps exist, their seamless, flexible, and scalable combination into a web-scale end-to-end text analytics system is still a true challenge. In this paper, we report our experiences from building such a system for comparing the "web view" on health-related topics with that derived from a controlled scientific corpus, i.e., Medline. The system combines a focused crawler, applying shallow text analysis and classification to maintain focus, with a sophisticated text analytics engine inside the Big Data processing system Stratosphere. We describe a practical approach to seed generation which led us to crawl a corpus of ~1 TB of web pages highly enriched for the biomedical domain. Pages were run through a complex pipeline of best-of-breed tools for a multitude of necessary tasks, such as HTML repair, boilerplate detection, sentence detection, linguistic annotation, parsing, and eventually named entity recognition for several types of entities. Results are compared with those from running the same pipeline (without the web-related tasks) on a corpus of 24 million scientific abstracts and a third corpus made of ~250K scientific full texts. We evaluate scalability, quality, and robustness of the employed methods and tools. The focus of this paper is to provide a large, real-life use case to inspire future research into robust, easy-to-use, and scalable methods for domain-specific IE at web scale.
Radio network information is leaked well beyond the perimeter in which the radio network is deployed. We investigate attacks where person location can be inferred using the radio characteristics of wireless links (e.g., the received signal strength). An attacker can deploy a network of receivers which measure the received signal strength of the radio signals transmitted by the legitimate wireless devices inside a perimeter, allowing the attacker to learn the locations of people moving in the vicinity of the devices inside the perimeter. In this paper, we develop the first solution to this location privacy problem where neither the attacker nodes nor the tracked moving object transmit any RF signals. We first model the radio network leakage attack using a Stackelberg game. Next, we define utility and cost functions related to the defender and attacker actions. Last, using our utility and cost functions, we find the optimal strategy for the defender by applying a greedy method. We evaluate our game theoretic framework using experiments and find that our approach significantly reduces the chance of an attacker determining the location of people inside a perimeter.
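The defender's greedy step can be sketched generically as below; the utility and cost callables are placeholders, since the paper's actual functions model the radio-leakage Stackelberg game, and the budgeted marginal-gain heuristic here is our assumption of what such a greedy method typically looks like.

```python
# Generic budgeted greedy selection of defender actions by marginal gain.
def greedy_defense(actions, utility, cost, budget):
    """actions: candidate defender actions; utility(set): defender utility;
    cost(action): deployment cost; budget: total defender budget."""
    chosen, remaining = [], list(actions)
    while remaining:
        best, best_gain = None, 0.0
        for a in remaining:
            c = cost(a)
            if c > budget:
                continue
            gain = (utility(chosen + [a]) - utility(chosen)) / max(c, 1e-9)
            if gain > best_gain:
                best, best_gain = a, gain
        if best is None:
            break
        chosen.append(best)
        budget -= cost(best)
        remaining.remove(best)
    return chosen
```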
As information systems become increasingly interdependent, there is an increased need to share cybersecurity data across government agencies and companies, and within and across industrial sectors. This sharing includes threat, vulnerability, and incident reporting data, among other data. For cyberattacks that include sociotechnical vectors, such as phishing or watering hole attacks, this increased sharing could expose customer and employee personal data to increased privacy risk. In the US, privacy risk arises when the government voluntarily receives data from companies without meaningful consent from individuals, or without a lawful procedure that protects an individual's right to due process. In this paper, we describe a study to examine the trade-off between the need for potentially sensitive data, which we call incident data usage, and the perceived privacy risk of sharing that data with the government. The study comprises two parts: a data usage estimate built from a survey of 76 security professionals with a mean of eight years' experience; and a privacy risk estimate that measures privacy risk using an ordinal likelihood scale and nominal data types in factorial vignettes. The privacy risk estimate also factors in data purposes with different levels of societal benefit, including terrorism, imminent threat of death, economic harm, and loss of intellectual property. The results show which data types are high-usage, low-risk versus those that are low-usage, high-risk. We discuss the implications of these results and recommend future work to improve privacy when data must be shared despite the increased risk to privacy.