Biblio
Popular anonymity mechanisms such as Tor provide low communication latency but are vulnerable to traffic analysis attacks that can de-anonymize users. Moreover, known traffic-analysis-resistant techniques such as Dissent are impractical for use in latency-sensitive settings such as wireless networks. In this paper, we propose PriFi, a low-latency protocol for anonymous communication in local area networks that is provably secure against traffic analysis attacks. It allows members of an organization to access the Internet anonymously while they are on-site, via privacy-preserving WiFi networking, or off-site, via privacy-preserving virtual private networking (VPN). PriFi achieves low latency through a client/relay/server architecture in which a set of servers computes cryptographic material in parallel with the clients, avoiding unnecessary communication latency. We also propose a technique for protecting against equivocation attacks, with which a malicious relay might de-anonymize clients. This is achieved without adding extra latency by encrypting client messages based on the history of all messages they have received so far. As a result, any equivocation attempt makes the communication unintelligible, preserving clients' anonymity while holding the servers accountable.
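As a hedged illustration of the anonymization core that PriFi-style systems build on, the following toy sketch shows a single DC-net round: each client XORs its slot with pads derived from secrets shared with each server, the servers contribute the matching pads, and the relay combines everything so that only the one real message survives, without revealing which client sent it. The names, slot size, and pad derivation via SHA-256 are illustrative assumptions, not the paper's actual construction.

```python
import hashlib
from functools import reduce

SLOT_LEN = 32  # bytes per slot (illustrative)

def pad(shared_secret: bytes, round_no: int) -> bytes:
    """Derive a per-round pseudorandom pad from a client/server shared secret."""
    return hashlib.sha256(shared_secret + round_no.to_bytes(4, "big")).digest()[:SLOT_LEN]

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

# Hypothetical setup: 3 clients, 2 servers, pairwise shared secrets.
clients = ["c0", "c1", "c2"]
servers = ["s0", "s1"]
secrets = {(c, s): hashlib.sha256(f"{c}|{s}".encode()).digest() for c in clients for s in servers}

round_no = 7
owner = "c1"                      # the client whose turn it is to transmit in this slot
message = b"hello, anonymous world".ljust(SLOT_LEN, b"\x00")

# Each client XORs the pads it shares with every server; the slot owner also XORs its message.
client_ciphertexts = []
for c in clients:
    ct = reduce(xor, (pad(secrets[(c, s)], round_no) for s in servers))
    if c == owner:
        ct = xor(ct, message)
    client_ciphertexts.append(ct)

# Each server contributes the XOR of the pads it shares with every client
# (these pads can be computed ahead of time, off the latency-critical path).
server_ciphertexts = [reduce(xor, (pad(secrets[(c, s)], round_no) for c in clients))
                      for s in servers]

# The relay XORs all contributions; every pad cancels and only the message remains.
output = reduce(xor, client_ciphertexts + server_ciphertexts)
assert output == message
print(output.rstrip(b"\x00"))
```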
With the help of rapidly developing technology, DNA sequencing is becoming less expensive. As a consequence, research in genomics has gained speed in paving the way to personalized (genomic) medicine, and geneticists need large collections of human genomes to further increase this speed. Furthermore, individuals are using their genomes to learn about their (genetic) predispositions to diseases, their ancestries, and even their (genetic) compatibilities with potential partners. This trend has also led to the launch of health-related websites and online social networks (OSNs) in which individuals share their genomic data (e.g., OpenSNP or 23andMe). On the other hand, genomic data carries a wealth of sensitive information about its owner. By analyzing the DNA of an individual, it is now possible to learn about their disease predispositions (e.g., for Alzheimer's or Parkinson's), ancestries, and physical attributes. The threat to genomic privacy is magnified by the fact that a person's genome is correlated with their family members' genomes, leading to interdependent privacy risks. This short tutorial will help computer scientists better understand the privacy and security challenges of today's genomic era. We will first highlight the significance of genomic data and the threats to genomic privacy. Then, we will present high-level descriptions of the proposed solutions for protecting the privacy of genomic data and discuss future research directions. No prerequisite knowledge of biology or genomics is required of attendees. We only require attendees to have a basic background in cryptography and statistics.
Processing smart grid data for analytics purposes brings about a series of privacy-related risks. To allow for the most suitable mitigation strategies, privacy risks need to be addressed by taking into consideration the perspective of each smart grid stakeholder separately. In this context, we use the notion of privacy concerns to reflect potential privacy risks from the perspective of different smart grid stakeholders. Privacy concerns help to derive privacy goals, which we represent using the Goal Structuring Notation. Goals represented in this way can be addressed more comprehensibly through technical and non-technical strategies and solutions. The thread of argumentation - from concerns to goals to strategies and solutions - is presented in the form of a privacy case, which is analogous to the safety case used in the automotive domain. We provide an exemplar privacy case for the smart grid, developed as part of the Aspern Smart City Research project.
The mainstream approach to protecting the privacy of mobile users in location-based services (LBSs) is to alter (e.g., perturb, hide, and so on) the users’ actual locations in order to reduce exposed sensitive information. To be effective, a location-privacy preserving mechanism must consider both the privacy and utility requirements of each user, as well as the user’s overall exposed locations (which contribute to the adversary’s background knowledge). In this article, we propose a methodology that enables the design of optimal user-centric location obfuscation mechanisms respecting each individual user’s service quality requirements, while maximizing the expected error that the optimal adversary incurs in reconstructing the user’s actual trace. A key advantage of a user-centric mechanism is that it does not depend on third-party proxies or anonymizers; thus, it can be directly integrated in the mobile devices that users employ to access LBSs. Our methodology is based on the mutual optimization of user/adversary objectives (maximizing location privacy versus minimizing localization error), formalized as a Stackelberg Bayesian game. This formalization makes our solution robust against any location inference attack, that is, the adversary cannot decrease the user’s privacy by designing a better inference algorithm as long as the obfuscation mechanism is designed according to our privacy games. We develop two linear programs that solve the location privacy game and output the optimal obfuscation strategy and its corresponding optimal inference attack. These linear programs are used to design location privacy–preserving mechanisms that consider the correlation between past, current, and future locations of the user, and can thus be tuned to protect different privacy objectives along the user’s location trace. We illustrate the efficacy of the optimal location privacy–preserving mechanisms obtained with our approach on real location traces, showing their performance in protecting users’ different location privacy objectives.
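To make the linear-programming step concrete, the sketch below solves a toy instance in the spirit of such Stackelberg location-privacy games: choose an obfuscation distribution f(o|r) that maximizes the adversary's expected inference error under the adversary's best response, subject to a bound on expected quality loss. The prior, the distance functions, and the bound Q_MAX are hypothetical, and the encoding follows the standard LP form of these games rather than the paper's exact programs.

```python
import numpy as np
from scipy.optimize import linprog

# Toy setting (hypothetical numbers): 4 locations on a line.
n = 4
pi = np.array([0.4, 0.3, 0.2, 0.1])                  # prior over the user's true location r
d = np.abs(np.subtract.outer(np.arange(n), np.arange(n))).astype(float)
d_q = d                                              # quality loss  d_q(r, o)
d_p = d                                              # adversary error / privacy gain  d_p(r_hat, r)
Q_MAX = 0.8                                          # bound on expected quality loss

# Variables: f[r, o] flattened (n*n entries), then x[o] (n entries).
num_f, num_x = n * n, n
c = np.concatenate([np.zeros(num_f), -np.ones(num_x)])   # maximize sum_o x[o]

A_ub, b_ub = [], []
# x[o] <= sum_r pi[r] * f[r, o] * d_p[r_hat, r]   for every (o, r_hat):
# at the optimum x[o] equals the error the best-responding adversary incurs given o.
for o in range(n):
    for r_hat in range(n):
        row = np.zeros(num_f + num_x)
        for r in range(n):
            row[r * n + o] = -pi[r] * d_p[r_hat, r]
        row[num_f + o] = 1.0
        A_ub.append(row); b_ub.append(0.0)
# Expected service-quality loss must stay below Q_MAX.
row = np.zeros(num_f + num_x)
for r in range(n):
    for o in range(n):
        row[r * n + o] = pi[r] * d_q[r, o]
A_ub.append(row); b_ub.append(Q_MAX)

# Each row of f must be a probability distribution over obfuscated locations.
A_eq, b_eq = [], []
for r in range(n):
    row = np.zeros(num_f + num_x)
    row[r * n:(r + 1) * n] = 1.0
    A_eq.append(row); b_eq.append(1.0)

bounds = [(0, 1)] * num_f + [(0, None)] * num_x
res = linprog(c, A_ub=np.array(A_ub), b_ub=np.array(b_ub),
              A_eq=np.array(A_eq), b_eq=np.array(b_eq), bounds=bounds)

f_opt = res.x[:num_f].reshape(n, n)                  # optimal obfuscation f(o | r)
print("expected adversary error (privacy):", -res.fun)
print("obfuscation matrix:\n", f_opt.round(3))
```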
A major challenge for utilities is energy theft, wherein malicious actors steal energy for financial gain. One such form of theft in the smart grid is the fraudulent amplification of energy generation measurements from distributed energy resources (DERs), such as photovoltaics. It is important to detect this form of malicious activity, but in a way that ensures the privacy of customers. Ignoring privacy aspects could, for example, result in a backlash from customers and a heavily curtailed deployment of services. In this short paper, we present a novel privacy-preserving approach to the detection of manipulated DER generation measurements.
Cloud computing, often referred to simply as “the cloud,” is the delivery of on-demand computing resources - everything from applications to data centers - over the Internet. The cloud is used not only to store data; the stored data can also be shared by multiple users. Because of this, the integrity of cloud data is open to doubt. It is not always feasible for a user to download all the data in order to verify its integrity, so the proposed system employs a Third Party Auditor (TPA) to verify the integrity of shared data. During auditing, the shared data is kept private from public verifiers, who are able to verify shared data integrity without downloading or retrieving the entire data file. A group signature scheme is used to preserve the identity privacy of group members from the third-party auditor. Privacy preservation ensures that the TPA cannot derive users' data content from the information collected during the auditing process.
In recent times, we have seen a proliferation of personal data. This can be attributed not just to a larger proportion of our lives moving online, but also to the rise of ubiquitous sensing through mobile and IoT devices. Alongside this surge, concerns over privacy, trust, and security are increasingly expressed as different parties attempt to take advantage of this rich assortment of data. The Databox seeks to enable all the advantages of personal data analytics while at the same time enforcing **accountability** and **control** in order to protect a user's privacy. In this work, we propose and delineate a personal networked device that allows users to **collate**, **curate**, and **mediate** their personal data.
The Internet of Things (IoT) offers new opportunities, but alongside those come many challenges for security and privacy. Most IoT devices offer users no choice about where their data is published, which data is made available, and what identities are used for both devices and users. The aim of this work is to explore new middleware models and techniques that can provide users with more choice as well as enhance privacy and security. This paper outlines a new model and a prototype of a middleware system that implements this model.
Provenance workflows capture the movement and transformation of data in complex environments, such as document management in large organizations, content generation and sharing in social media, scientific computations, etc. Sharing and processing of provenance workflows brings numerous benefits, e.g., improving productivity in an organization, understanding social media interaction patterns, etc. However, directly sharing provenance may also disclose sensitive information such as confidential business practices, or private details about participants in a social network. We propose an algorithm that privately extracts sequential association rules from provenance workflow datasets. Finding such rules has numerous practical applications, such as capacity planning or identifying hot-spots in provenance graphs. Our approach provides good accuracy and strong privacy by leveraging the exponential mechanism of differential privacy. We propose a heuristic that identifies promising candidate rules and makes judicious use of the privacy budget. Experimental results show that our approach is fast and accurate, and clearly outperforms the state of the art. We also identify influential factors in improving accuracy, which helps in choosing promising directions for future improvement.
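To illustrate the privacy primitive the abstract relies on, here is a minimal sketch of the exponential mechanism applied to rule selection: each candidate rule is scored by a utility (here, its support count), and one rule is sampled with probability proportional to exp(epsilon * utility / (2 * sensitivity)). The candidate rules, their supports, and the sensitivity of 1 are made-up assumptions; the paper's heuristic for generating candidates and splitting the privacy budget is not shown.

```python
import math
import random

def exponential_mechanism(candidates, utility, epsilon, sensitivity):
    """Select one candidate with probability proportional to exp(eps * u / (2 * sensitivity))."""
    scores = [epsilon * utility[c] / (2.0 * sensitivity) for c in candidates]
    m = max(scores)                                   # subtract the max for numerical stability
    weights = [math.exp(s - m) for s in scores]
    r = random.uniform(0, sum(weights))
    acc = 0.0
    for c, w in zip(candidates, weights):
        acc += w
        if r <= acc:
            return c
    return candidates[-1]

# Hypothetical candidate sequential rules over workflow steps, with their support counts.
support = {
    ("ingest", "clean", "publish"): 120,
    ("ingest", "publish"): 80,
    ("clean", "review", "publish"): 45,
    ("review", "archive"): 10,
}
# Assume one workflow contributes to a rule's support at most once, so the sensitivity is 1.
picked = exponential_mechanism(list(support), support, epsilon=0.5, sensitivity=1)
print("privately selected rule:", " -> ".join(picked))
```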
Biometric verification systems raise security concerns regarding the storage of biometric data, because a user's biometric features cannot be changed for new ones even when the system is compromised. To address this issue, it may be safer to store the biometric data on a reliable remote server instead of storing it on the local device. However, this approach may raise a privacy issue. In this paper, we propose a biometric verification system in which the biometric data are stored on a remote server in encrypted form, and the similarity of the user input to the registered biometric data is computed in the encrypted domain using homomorphic encryption. We evaluated the performance of the proposed system through an implementation on an Android-based smartphone and an i7-based server.
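The kind of encrypted-domain similarity computation described above can be sketched with an additively homomorphic scheme. The toy below uses the third-party python-paillier (`phe`) package and the identity ||t - p||^2 = sum(t_i^2) - 2*sum(t_i*p_i) + sum(p_i^2): the server stores only Enc(t_i) and Enc(t_i^2) from enrollment and combines them with a probe it sees in the clear. This is a simplified assumption for illustration, not the paper's protocol; in a real deployment the probe would also be protected and the key would be held by an appropriate party.

```python
from phe import paillier  # pip install phe (python-paillier)

# Key pair held by the party allowed to see the final similarity score
# (e.g., the user's device); the server never sees the secret key.
public_key, private_key = paillier.generate_paillier_keypair(n_length=2048)

# Enrollment: the template t is encrypted element-wise (values and their squares)
# and only the ciphertexts are stored on the remote server.
template = [3, 1, 4, 1, 5]
enc_t = [public_key.encrypt(v) for v in template]
enc_t_sq = [public_key.encrypt(v * v) for v in template]

# Verification: for a fresh probe p, the server evaluates
#   ||t - p||^2 = sum(t_i^2) - 2 * sum(p_i * t_i) + sum(p_i^2)
# entirely on ciphertexts, using only additions and multiplications by known scalars.
probe = [3, 2, 4, 2, 5]
enc_dist = public_key.encrypt(sum(p * p for p in probe))
for e_t, e_t2, p in zip(enc_t, enc_t_sq, probe):
    enc_dist = enc_dist + e_t2 + e_t * (-2 * p)

# Only the key holder can open the result and compare it against a threshold.
distance_sq = private_key.decrypt(enc_dist)
print("squared distance:", distance_sq, "accept:", distance_sq <= 3)
```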
Recent advances in differential privacy make it possible to guarantee user privacy while preserving the main characteristics of the data. However, most differential privacy mechanisms assume that the underlying dataset is clean. This paper explores the link between data cleaning and differential privacy in a framework we call PrivateClean. PrivateClean includes a technique for creating private datasets of numerical and discrete-valued attributes, a formalism for privacy-preserving data cleaning, and techniques for answering sum, count, and avg queries after cleaning. We show: (1) how the degree of privacy affects subsequent aggregate query accuracy, (2) how privacy potentially amplifies certain types of errors in a dataset, and (3) how this analysis can be used to tune the degree of privacy. The key insight is to maintain a bipartite graph relating dirty values to clean values and use this graph to estimate biases due to the interaction between cleaning and privacy. We validate these results on four datasets with a variety of well-studied cleaning techniques including using functional dependencies, outlier filtering, and resolving inconsistent attributes.
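A minimal sketch of the kind of local privatization and query debiasing PrivateClean builds on, assuming generalized randomized response over a discrete attribute: each value is kept with probability 1-p and otherwise replaced by a uniform draw from the domain, cleaning is expressed as a dirty-to-clean mapping, and a count query is then debiased per dirty value. The domain, data, and cleaning rule are made up for illustration; the actual system's bipartite-graph bookkeeping and numerical-attribute handling are not shown.

```python
import random

DOMAIN = ["NY", "NYC", "SF", "LA"]   # hypothetical domain ("NY" is a dirty spelling of "NYC")
K = len(DOMAIN)
P_FLIP = 0.3                         # randomization probability (controls the privacy level)

def privatize(value: str) -> str:
    """Generalized randomized response: keep the value w.p. 1-p, else replace uniformly."""
    return value if random.random() > P_FLIP else random.choice(DOMAIN)

def debiased_count(column, value: str) -> float:
    """Unbiased estimate of how many records truly hold `value`, undoing the randomization."""
    n = len(column)
    observed = sum(1 for v in column if v == value)
    return (observed - P_FLIP * n / K) / (1.0 - P_FLIP)

# The analyst only ever sees the privatized column.
raw = ["NYC"] * 500 + ["NY"] * 200 + ["SF"] * 200 + ["LA"] * 100
private = [privatize(v) for v in raw]

# Cleaning is a mapping from dirty to clean values (a bipartite relation in PrivateClean's
# terminology); here the inconsistent spelling "NY" is merged into "NYC".
cleaning_map = {"NY": "NYC", "NYC": "NYC", "SF": "SF", "LA": "LA"}

def cleaned_count(column, clean_value: str) -> float:
    """Count query on cleaned data: debias each dirty value that maps to the clean value."""
    return sum(debiased_count(column, dirty)
               for dirty, clean in cleaning_map.items() if clean == clean_value)

print("estimated NYC count after cleaning:", round(cleaned_count(private, "NYC")))  # approx. 700
```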
Given a set D of tuples defined on a domain Omega, we study differentially private algorithms for constructing a histogram over Omega to approximate the tuple distribution in D. Existing solutions for the problem mostly adopt a hierarchical decomposition approach, which recursively splits Omega into sub-domains and computes a noisy tuple count for each sub-domain, until all noisy counts are below a certain threshold. This approach, however, requires that we (i) impose a limit h on the recursion depth in the splitting of Omega and (ii) set the noise in each count to be proportional to h. The choice of h is a serious dilemma: a small h makes the resulting histogram too coarse-grained, while a large h leads to excessive noise in the tuple counts used in deciding whether sub-domains should be split. Furthermore, h cannot be directly tuned based on D; otherwise, the choice of h itself reveals private information and violates differential privacy. To remedy the deficiency of existing solutions, we present PrivTree, a histogram construction algorithm that adopts hierarchical decomposition but completely eliminates the dependency on a pre-defined h. The core of PrivTree is a novel mechanism that (i) exploits a new analysis of the Laplace distribution and (ii) enables us to use only a constant amount of noise in deciding whether a sub-domain should be split, without worrying about the recursion depth of splitting. We demonstrate the application of PrivTree in modelling spatial data, and show that it can be extended to handle sequence data (where the decision in sub-domain splitting is not based on tuple counts but a more sophisticated measure). Our experiments on a variety of real datasets show that PrivTree considerably outperforms the state of the art in terms of data utility.
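The sketch below illustrates the flavor of PrivTree's splitting rule on a 1-D domain: each interval's count is biased downward with depth, a constant-scale Laplace perturbation is added, and the interval is split only if the noisy value exceeds a threshold, with no pre-defined depth limit h. The constants LAMBDA, DELTA, and THETA are purely illustrative; the paper derives the exact values required for the differential-privacy guarantee.

```python
import math
import random

LAMBDA = 2.0   # Laplace noise scale (illustrative; PrivTree derives the value needed for epsilon-DP)
DELTA = 3.0    # per-level decay applied to the biased count
THETA = 10.0   # splitting threshold

def laplace(scale: float) -> float:
    u = random.random() - 0.5
    return -scale * math.copysign(1.0, u) * math.log(1 - 2 * abs(u))

def build(points, lo, hi, depth=0):
    """Recursively split [lo, hi) while a noisy, depth-biased count stays above the threshold.
    The noise scale is constant: there is no pre-defined recursion-depth limit h."""
    count = sum(1 for p in points if lo <= p < hi)
    biased = max(count - depth * DELTA, THETA - DELTA)   # bias the count downward with depth
    noisy_decision = biased + laplace(LAMBDA)            # constant-scale noise for the split test
    node = {"range": (lo, hi), "noisy_count": count + laplace(LAMBDA), "children": []}
    if noisy_decision > THETA and hi - lo > 1e-6:
        mid = (lo + hi) / 2.0
        node["children"] = [build(points, lo, mid, depth + 1),
                            build(points, mid, hi, depth + 1)]
    return node

# Toy 1-D dataset with two clusters; the tree refines more where the data is dense.
data = [random.gauss(0.3, 0.05) for _ in range(500)] + [random.gauss(0.8, 0.1) for _ in range(300)]
tree = build(data, 0.0, 1.0)
print("root noisy count:", round(tree["noisy_count"], 1), "- children:", len(tree["children"]))
```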
Cloud computing provides a shared pool of resources for large-scale distributed applications. Recent trends such as fog computing and edge computing spread the workload of clouds closer towards the edge of the network and the users. Exploiting the edge resources efficiently requires managing the resources and directing user traffic to the correct edge servers. In this paper we propose to profile and group users according to their interest profiles. We consider edge caching as an example and, through our evaluation, show the potential benefits of directing users from the same group to the same caches. We investigate a range of workloads and parameters and find that the same conclusions apply. Our results highlight the importance of grouping users and demonstrate the potential benefits of this approach.
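A toy sketch of the profile-and-group idea, under assumptions of our own: users are represented by interest vectors over content categories, grouped with a few k-means iterations, and each group is pinned to one edge cache so that users with similar interests hit the same cache. The synthetic profiles, the number of groups, and the cache names are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical interest profiles: rows are users, columns are content categories.
profiles = np.vstack([
    rng.dirichlet([8, 1, 1, 1], size=40),   # mostly-sports users
    rng.dirichlet([1, 8, 1, 1], size=40),   # mostly-news users
    rng.dirichlet([1, 1, 8, 1], size=40),   # mostly-music users
])

def kmeans(x, k, iters=20):
    """Plain k-means with a fixed number of iterations."""
    centers = x[rng.choice(len(x), size=k, replace=False)]
    for _ in range(iters):
        labels = np.argmin(((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2), axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = x[labels == j].mean(axis=0)
    return labels

groups = kmeans(profiles, k=3)
caches = ["edge-cache-A", "edge-cache-B", "edge-cache-C"]   # hypothetical cache names

def route(user_idx: int) -> str:
    """Direct users from the same interest group to the same cache."""
    return caches[groups[user_idx]]

print(route(0), route(50), route(100))
```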
Physical layer security for wireless communication is broadly considered a promising approach to protect data confidentiality against eavesdroppers. However, despite its ample theoretical foundation, the transition to practical implementations of physical-layer security has so far seen little success. A close inspection of proven vulnerable physical-layer security designs reveals that the flaws are usually overlooked when the scheme is only evaluated against an inferior, single-antenna eavesdropper. Meanwhile, the attacks exposing vulnerabilities often lack theoretical justification. To reduce the gap between theory and practice, we posit that a physical-layer security scheme must be studied under multiple adversarial models to fully grasp its security strength. In this regard, we evaluate a specific physical-layer security scheme, i.e., orthogonal blinding, under multiple eavesdropper settings. We further propose a practical "ciphertext-only attack" that allows eavesdroppers to recover the original message by exploiting the low-entropy fields in wireless packets. By means of simulation, we are able to reduce the symbol error rate at an eavesdropper to below 1% using only the eavesdropper's received data and general knowledge of the format of the wireless packets.
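The toy simulation below conveys the intuition behind such an attack (it is our own simplified model, not the paper's): data is beamformed to Bob while artificial noise is sent in the direction his channel cannot see; a two-antenna eavesdropper then uses the predictable low-entropy header symbols as training data to fit a linear filter that suppresses the artificial noise and recovers the payload.

```python
import numpy as np

rng = np.random.default_rng(1)
N_TRAIN, N_DATA = 64, 5000      # known low-entropy header symbols vs. unknown payload symbols

def cgauss(*shape):
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)

# Channels: Alice (2 TX antennas) -> Bob (1 antenna) and -> Eve (2 antennas).
h_bob = cgauss(2)
H_eve = cgauss(2, 2)

# Orthogonal blinding: data beamformed towards Bob, artificial noise sent in the direction
# that Bob's channel cannot see (Bob is unaffected, a single-antenna Eve would be jammed).
v_data = h_bob.conj() / np.linalg.norm(h_bob)
v_null = np.array([-h_bob[1], h_bob[0]]); v_null /= np.linalg.norm(v_null)

s = rng.choice([-1.0, 1.0], size=N_TRAIN + N_DATA)      # BPSK symbols
a = np.sqrt(2.0) * cgauss(N_TRAIN + N_DATA)              # artificial noise, stronger than the data
X = np.outer(v_data, s) + np.outer(v_null, a)            # 2 x T transmitted signal
Y = H_eve @ X + 0.05 * cgauss(2, N_TRAIN + N_DATA)       # Eve's 2-antenna observation

# Known-plaintext step: the first N_TRAIN symbols (preamble / low-entropy header fields)
# are predictable, so Eve fits a linear filter mapping her observation back to them ...
w, *_ = np.linalg.lstsq(Y[:, :N_TRAIN].T, s[:N_TRAIN].astype(complex), rcond=None)

# ... and applies the same filter to the unknown payload.
s_hat = np.sign((w @ Y[:, N_TRAIN:]).real)
ser = np.mean(s_hat != s[N_TRAIN:])
print(f"Eve's symbol error rate with 2 antennas: {ser:.3%}")   # typically very low in this toy model
```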
We propose a game-theoretic framework for task allocation in mobile cloud computing that corresponds to offloading compute tasks to a group of nearby mobile devices. Specifically, in our framework, a distributor node holds a multidimensional auction for allocating the tasks of a job among nearby mobile nodes based on their computational capabilities as well as the cost of computation at these nodes, with the goal of reducing the overall job completion time. Our proposed auction also has the desired incentive compatibility property, which ensures that mobile devices truthfully reveal their capabilities and costs and that those devices benefit from the task allocation. To deal with node mobility, we perform multiple auctions over adaptive time intervals. We develop a heuristic approach to dynamically find the best time intervals between auctions so as to minimize unnecessary auctions and the accompanying overheads. We evaluate our framework and methods using both real-world and synthetic mobility traces. Our evaluation results show that our game-theoretic framework improves the job completion time by a factor of 2-5 compared to executing the job locally, while minimizing the number of auctions and the accompanying overheads. Our approach is also profitable for the nearby nodes that execute the distributor's tasks, with these nodes receiving compensation higher than their actual costs.
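As a simplified, single-dimensional stand-in for the multidimensional auction described above, the sketch below runs a reverse Vickrey (second-price) auction per task: each nearby device reports a cost, the cheapest device wins, and it is paid the second-lowest reported cost, which makes truthful reporting a dominant strategy in this reduced setting. The paper's mechanism additionally scores computational capability and completion time; that part is not modeled here, and all names and numbers are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class Bid:
    device: str
    cost: float        # the device's reported cost for executing one task

def reverse_vickrey(bids: List[Bid]) -> Tuple[str, float]:
    """Allocate the task to the lowest-cost bidder and pay it the second-lowest cost."""
    ordered = sorted(bids, key=lambda b: b.cost)
    winner, runner_up = ordered[0], ordered[1]
    return winner.device, runner_up.cost

# Hypothetical bids from nearby devices for one task of the distributor's job.
bids = [Bid("phone-A", 1.2), Bid("phone-B", 0.8), Bid("tablet-C", 1.5)]
winner, payment = reverse_vickrey(bids)
print(f"{winner} executes the task and is paid {payment}")   # phone-B, paid 1.2 (> its cost 0.8)
```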
In recent years, the analog-to-information converter (AIC), based on the compressed sensing (CS) paradigm, has emerged as a promising solution to overcome the performance and energy-efficiency limitations of traditional analog-to-digital converters (ADCs). In particular, AICs can enable sub-Nyquist signal sampling proportional to the intrinsic information content of the signal in biomedical applications. However, the legacy AIC structure is tailored toward specific applications, which limits its flexibility and prevents its universal use. In this paper, we introduce a novel programmable AIC architecture, Pro-AIC, to enable effective configurability and reduce its energy overhead by integrating an efficient multiplexing hardware design. To improve the quality and time-efficiency of Pro-AIC configuration, we also develop a rapid configuration algorithm, called RapSpiral, to quickly find a near-optimal parameter configuration for the Pro-AIC architecture. Specifically, we present a design metric, the trade-off penalty, to quantitatively evaluate the performance-energy trade-off. RapSpiral controls a penalty-driven shrinking triangle to progressively approximate the optimal trade-off. RapSpiral has log(n) complexity yet high accuracy, and requires neither pretraining nor a complex parameter tuning procedure. RapSpiral is also likely to avoid local-minimum pitfalls. Experimental results indicate that our RapSpiral algorithm can achieve more than 30x speedup compared with a brute-force search, with only about a 3% trade-off compromise with respect to the optimum in Pro-AIC. Furthermore, its scalability is also verified on larger benchmarks.
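The "penalty-driven shrinking triangle" is reminiscent of a derivative-free simplex search. The sketch below is a generic two-parameter version of that idea applied to a made-up trade-off penalty; it is not the authors' RapSpiral algorithm, and the configuration parameters and penalty function are assumptions chosen only to show the reflect-and-shrink control flow.

```python
import numpy as np

def tradeoff_penalty(cfg):
    """Hypothetical penalty trading reconstruction error against energy for a configuration
    cfg = (sampling_rate, resolution_bits); the real metric is defined in the paper."""
    rate, bits = cfg
    error = np.exp(-0.3 * rate) + np.exp(-0.3 * bits)   # more samples/bits -> lower error
    energy = 0.05 * (rate + bits)                        # more samples/bits -> higher energy
    return error + energy

def shrinking_triangle(f, triangle, iters=60):
    """Reflect the worst vertex through the centroid of the better two; shrink on failure."""
    tri = np.array(triangle, dtype=float)
    for _ in range(iters):
        order = np.argsort([f(v) for v in tri])
        tri = tri[order]                        # tri[0] best, tri[2] worst
        centroid = tri[:2].mean(axis=0)
        reflected = centroid + (centroid - tri[2])
        if f(reflected) < f(tri[2]):
            tri[2] = reflected                  # keep the improvement
        else:
            tri[1:] = tri[0] + 0.5 * (tri[1:] - tri[0])   # shrink toward the best vertex
    return tri[0], f(tri[0])

best_cfg, best_penalty = shrinking_triangle(tradeoff_penalty, [(1.0, 2.0), (4.0, 6.0), (8.0, 3.0)])
print("configuration:", best_cfg.round(2), "penalty:", round(best_penalty, 3))
```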
Among the signature schemes most widely deployed in practice are the DSA (Digital Signature Algorithm) and its elliptic curves variant ECDSA. They are represented in many international standards, including IEEE P1363, ANSI X9.62, and FIPS 186-4. Their popularity stands in stark contrast to the absence of rigorous security analyses: Previous works either study modified versions of (EC)DSA or provide a security analysis of unmodified ECDSA in the generic group model. Unfortunately, works following the latter approach assume abstractions of non-algebraic functions over generic groups for which it remains unclear how they translate to the security of ECDSA in practice. For instance, it has been pointed out that prior results in the generic group model actually establish strong unforgeability of ECDSA, a property that the scheme de facto does not possess. As, further, no formal results are known for DSA, understanding the security of both schemes remains an open problem. In this work we propose GenericDSA, a signature framework that subsumes both DSA and ECDSA in unmodified form. It carefully models the "modulo q" conversion function of (EC)DSA as a composition of three independent functions. The two outer functions mimic algebraic properties in the function's domain and range, the inner one is modeled as a bijective random oracle. We rigorously prove results on the security of GenericDSA that indicate that forging signatures in (EC)DSA is as hard as solving discrete logarithms. Importantly, our proofs do not assume generic group behavior.
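For readers unfamiliar with the scheme under analysis, the snippet below shows unmodified ECDSA signing and verification using the third-party `ecdsa` Python package; the conversion function whose modeling the paper discusses is the map from the nonce point R to the integer r = x(R) mod q that happens inside signing. This is ordinary library usage, not the paper's framework.

```python
import hashlib
from ecdsa import NIST256p, SigningKey, BadSignatureError   # pip install ecdsa

# Standard, unmodified ECDSA over the P-256 curve (one instance of what GenericDSA subsumes).
sk = SigningKey.generate(curve=NIST256p)     # signer's secret key
vk = sk.get_verifying_key()                  # corresponding public key

message = b"transfer 10 CHF to Bob"
signature = sk.sign(message, hashfunc=hashlib.sha256)

# Verification succeeds for the original message ...
assert vk.verify(signature, message, hashfunc=hashlib.sha256)

# ... and fails for any tampered message (an exception is raised).
try:
    vk.verify(signature, b"transfer 1000 CHF to Bob", hashfunc=hashlib.sha256)
except BadSignatureError:
    print("tampered message rejected")
```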
Blind signatures can be deployed to preserve user anonymity and are widely used in digital cash and e-voting. As interactive protocols, blind signature schemes require high efficiency. In this paper, we propose an efficient code-based blind signature scheme that, unlike existing code-based signature schemes, can produce a valid signature without many signing iterations. We then prove the security of our scheme in the random oracle model and analyze its efficiency. Since code-based signature schemes belong to post-quantum cryptography, the scheme is also able to resist quantum attacks.
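To make the blind-signature workflow concrete, the sketch below shows the classic Chaum RSA blind signature (blind, sign, unblind, verify) purely for intuition; it is not the code-based construction proposed in the paper, and the use of raw RSA integers is a textbook simplification.

```python
import hashlib
import secrets
from cryptography.hazmat.primitives.asymmetric import rsa   # pip install cryptography

# Signer's RSA key; the raw integers (n, e, d) are used for the textbook scheme.
key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
n = key.private_numbers().public_numbers.n
e = key.private_numbers().public_numbers.e
d = key.private_numbers().d

def h(msg: bytes) -> int:
    return int.from_bytes(hashlib.sha256(msg).digest(), "big")

# 1) User blinds the message hash with a random factor r (invertible mod n w.h.p.).
message = b"one anonymous coin"
r = secrets.randbelow(n - 2) + 2
blinded = (h(message) * pow(r, e, n)) % n

# 2) Signer signs the blinded value without learning the message.
blind_sig = pow(blinded, d, n)

# 3) User unblinds, obtaining an ordinary RSA signature on h(message).
signature = (blind_sig * pow(r, -1, n)) % n

# 4) Anyone can verify with the public key; the signer cannot link this signature
#    to the blinded value it saw, which is what provides anonymity.
assert pow(signature, e, n) == h(message) % n
print("valid, unlinkable signature")
```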
Collecting and processing provenance, i.e., information describing the production process of some end product, is important in various applications, e.g., to assess quality, to ensure reproducibility, or to reinforce trust in the end product. In the past, different types of provenance meta-data have been proposed, each with a different scope. The first part of the proposed tutorial provides an overview and comparison of these different types of provenance. To put provenance to good use, it is essential to be able to interact with and present provenance data in a user-friendly way. Users interested in provenance are often not experts in databases or query languages; they are typically domain experts of the product and production process for which provenance is collected (biologists, journalists, etc.). Furthermore, in some scenarios it is difficult to analyze and explore provenance data using queries alone. The second part of this tutorial therefore focuses on enabling users to leverage provenance through adapted visualizations. To this end, we will present some fundamental concepts of visualization before discussing possible visualizations for provenance.
One essential requirement for supporting analytics in Big Medical Data systems is the provision of a suitable level of traceability for data or processes ('Items') in large volumes of data. Systems should be designed from the outset to support the usage of such Items across the spectrum of medical use and over time, in order to promote traceability, to simplify maintenance, and to assist analytics. The philosophy proposed in this paper is to design medical data systems using a 'description-driven' approach, in which meta-data and the descriptions of medical Items are saved alongside the data, simplifying Item re-use and thereby enabling the traceability of these Items over time and their use in analytics. Details are given of a big data system in neuroimaging to demonstrate aspects of provenance data capture, collaborative analysis, and longitudinal information traceability. Evidence is presented that the description-driven approach leads to simplicity of design and ease of maintenance following the adoption of a unified approach to Item management.
Defenders of enterprise networks have a critical need to quickly identify the root causes of malware and data leakage. Increasingly, USB storage devices are the media of choice for data exfiltration, malware propagation, and even cyber-warfare. We observe that a critical aspect of explaining and preventing such attacks is understanding the provenance of data (i.e., the lineage of data from its creation to its current state) on USB devices as a means of ensuring their safe usage. Unfortunately, provenance tracking is not offered by even sophisticated modern devices. This work presents ProvUSB, an architecture for fine-grained provenance collection and tracking on smart USB devices. ProvUSB maintains data provenance by recording reads and writes at the block layer and reliably identifying the hosts editing those blocks through attestation over the USB channel. Our evaluation finds that ProvUSB imposes a one-time 850 ms overhead during USB enumeration, approaches near-bare-metal runtime performance (90% of throughput) on larger files during normal execution, and incurs less than 0.1% storage overhead for provenance in real-world workloads. ProvUSB thus provides essential new techniques for the defense of computer systems and USB storage devices.
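The core bookkeeping such a system performs can be pictured with a toy block-level provenance log: every read or write is recorded together with the (attested) host identity and a hash of the block contents, so the lineage of any block can later be reconstructed. The record format and the host-attestation step are simplified assumptions here, not ProvUSB's actual on-device data structures.

```python
import hashlib
import time
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class ProvenanceRecord:
    timestamp: float
    host_id: str        # identity established via attestation over the USB channel (simplified)
    operation: str      # "read" or "write"
    block: int
    content_hash: str

@dataclass
class ProvenanceStore:
    """Toy block-layer provenance: an append-only history of operations on each block."""
    blocks: Dict[int, bytes] = field(default_factory=dict)
    log: List[ProvenanceRecord] = field(default_factory=list)

    def write(self, host_id: str, block: int, data: bytes) -> None:
        self.blocks[block] = data
        self._record(host_id, "write", block, data)

    def read(self, host_id: str, block: int) -> bytes:
        data = self.blocks.get(block, b"")
        self._record(host_id, "read", block, data)
        return data

    def _record(self, host_id, op, block, data):
        self.log.append(ProvenanceRecord(time.time(), host_id, op, block,
                                         hashlib.sha256(data).hexdigest()))

    def lineage(self, block: int) -> List[ProvenanceRecord]:
        """All recorded operations touching a block, oldest first."""
        return [r for r in self.log if r.block == block]

# Example: a block written on an attested host, then read on an unknown machine.
store = ProvenanceStore()
store.write("laptop-attested-001", block=42, data=b"quarterly report v1")
store.read("kiosk-unverified", block=42)
for rec in store.lineage(42):
    print(rec.operation, rec.host_id, rec.content_hash[:12])
```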
Cloud service providers offer a storage outsourcing facility to their clients. In a secure cloud storage (SCS) protocol, the integrity of the client's data is maintained. In this work, we construct a publicly verifiable secure cloud storage protocol based on a secure network coding (SNC) protocol, where the client can update the outsourced data as needed. To the best of our knowledge, our scheme is the first SNC-based SCS protocol for dynamic data that is secure in the standard model and provides privacy-preserving audits in a publicly verifiable setting. Furthermore, we discuss in detail the (im)possibility of providing a general construction of an efficient SCS protocol for dynamic data (DSCS protocol) from an arbitrary SNC protocol. In addition, we modify an existing DSCS scheme (DPDP I) in order to support privacy-preserving audits. We also compare our DSCS protocol with other SCS schemes (including the modified DPDP I scheme). Finally, we point out some limitations of SCS schemes constructed from SNC protocols.
Graph data publishing under node-differential privacy (node-DP) is challenging due to the high sensitivity of queries. However, since a node in graph data oftentimes represents a person, node-DP is necessary to achieve personal data protection. In this paper, we investigate the problem of publishing the degree distribution of a graph under node-DP by exploring the projection approach to reduce the sensitivity. We propose two approaches, based on aggregation and on a cumulative histogram, to publish the degree distribution. The experiments demonstrate that our approaches greatly reduce the error in approximating the true degree distribution and significantly improve over existing works. We also present an introspective analysis for understanding the factors that affect publishing the degree distribution with node-DP.
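A minimal sketch of the projection idea, under our own simplifying assumptions: the graph is first projected onto a bounded-degree graph (here by a naive edge-dropping rule), the degree histogram of the projected graph is computed, and Laplace noise scaled to a sensitivity on the order of the degree bound theta is added. The projection rule and the sensitivity constant used below are illustrative only; pinning down the exact constants (including the projection's own contribution to sensitivity) is precisely what the paper's analysis addresses.

```python
import math
import random
from collections import defaultdict

THETA = 5          # degree bound used by the projection (illustrative)
EPSILON = 1.0

def project(edges, theta):
    """Naive projection: process edges in a fixed order and keep an edge only if both
    endpoints still have residual degree budget (one of several possible projections)."""
    degree = defaultdict(int)
    kept = []
    for u, v in sorted(edges):
        if degree[u] < theta and degree[v] < theta:
            kept.append((u, v))
            degree[u] += 1
            degree[v] += 1
    return kept, degree

def laplace(scale):
    u = random.random() - 0.5
    return -scale * math.copysign(1.0, u) * math.log(1 - 2 * abs(u))

def private_degree_histogram(nodes, edges, theta=THETA, epsilon=EPSILON):
    _, degree = project(edges, theta)
    hist = [sum(1 for node in nodes if degree[node] == d) for d in range(theta + 1)]
    # Under node-DP, one node can change its own bin and shift up to theta neighbours,
    # so the histogram's L1 sensitivity is on the order of theta; the value below is a
    # placeholder, not the tight bound derived in the paper.
    sensitivity = 2 * theta + 1
    return [h + laplace(sensitivity / epsilon) for h in hist]

# Toy graph.
nodes = list(range(12))
edges = [(i, j) for i in nodes for j in nodes if i < j and random.random() < 0.3]
print([round(x, 1) for x in private_degree_histogram(nodes, edges)])
```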