Biblio

Filters: Keyword is privacy models and measurement
2020-04-20
Hu, Boyang, Yan, Qiben, Zheng, Yao.  2018.  Tracking location privacy leakage of mobile ad networks at scale. IEEE INFOCOM 2018 - IEEE Conference on Computer Communications Workshops (INFOCOM WKSHPS). :1–2.

The online advertising ecosystem is built upon the massive data collection of ad networks, which learn the properties of users for targeted ad delivery. Existing efforts have investigated the privacy leakage behaviors of mobile ad networks, but a large-scale measurement study evaluating the scale of privacy leakage through mobile ads has been lacking. In this work, we present a study of the potential privacy leakage in location-based mobile advertising services based on a large-scale measurement. We first introduce a threat model in the mobile ad ecosystem, and then design a measurement system to perform extensive threat measurements and assessments. To counteract the privacy leakage threats, we design and implement an adaptive location obfuscation mechanism, which can be used to obfuscate location data in real time while minimizing the impact on mobile ad businesses.
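
The abstract does not spell out the obfuscation mechanism itself; as a rough illustration of real-time location obfuscation, the sketch below perturbs a coordinate with planar Laplace noise in the style of geo-indistinguishability before it would be handed to an ad SDK. The epsilon value and the meters-to-degrees conversion are illustrative assumptions, not parameters from the paper.

```python
import math
import random
from scipy.special import lambertw

def planar_laplace_offset(epsilon):
    """Sample a 2-D offset (in meters) from the planar Laplace distribution."""
    theta = random.uniform(0.0, 2.0 * math.pi)
    p = random.random()
    # Inverse CDF of the radial component uses the W_{-1} branch of Lambert W.
    r = -(1.0 / epsilon) * (lambertw((p - 1.0) / math.e, k=-1).real + 1.0)
    return r * math.cos(theta), r * math.sin(theta)

def obfuscate(lat, lon, epsilon=0.01):
    """Return a noisy location to report instead of the true fix.

    epsilon is in 1/meters, so smaller values mean stronger obfuscation;
    the degree conversion below is a rough equirectangular approximation.
    """
    dx, dy = planar_laplace_offset(epsilon)
    meters_per_deg = 111_320.0
    return (lat + dy / meters_per_deg,
            lon + dx / (meters_per_deg * math.cos(math.radians(lat))))

print(obfuscate(40.7128, -74.0060))
```
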
Esquivel-Quiros, Luis Gustavo, Barrantes, Elena Gabriela, Darlington, Fernando Esponda.  2018.  Measuring data privacy preserving and machine learning. 2018 7th International Conference On Software Process Improvement (CIMPS). :85–94.

The increasing publication of large amounts of theoretically anonymous data can lead to a number of attacks on people's privacy. Publishing sensitive data without exposing the data owners is generally not among software developers' concerns. Data privacy regulations create an appropriate scenario in which to focus on privacy from the perspective of the data use or exploration that takes place in an organization. The increasing number of sanctions for privacy violations motivates a systematic comparison of three well-known machine learning algorithms in order to measure the utility of privacy-preserved data; the scope of the evaluation is extended by comparing them against a known privacy-preservation metric. Different parameter scenarios and privacy levels are used. The use of publicly available implementations, the presentation of the methodology, and the explanation and analysis of the experiments provide a working framework for the problem of privacy preservation. Problems are shown in measuring the utility of the data and in relating it to privacy preservation. The findings motivate the need for metrics optimized for the privacy preferences of the data owners, since the risk of predicting sensitive attributes by means of machine learning techniques is not usually eliminated; they also show that full privacy preservation may be claimed yet cannot actually be measured, while the machine learning models of interest to the organization publishing the data must still perform adequately.
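
As a minimal sketch of the kind of utility measurement this abstract describes, the snippet below trains the same classifier on raw features and on crudely generalized (bucketed) features and compares accuracy. The public dataset, the binning step, and the single decision tree are stand-ins for the paper's three algorithms, anonymization procedure, and privacy metric.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

def quantile_edges(X, bins=4):
    """Per-feature bin edges, learned on the training split only."""
    return [np.quantile(X[:, j], np.linspace(0, 1, bins + 1))[1:-1]
            for j in range(X.shape[1])]

def generalize(X, edges):
    """Crude anonymization: replace each value with its coarse bucket index."""
    return np.column_stack([np.digitize(X[:, j], e) for j, e in enumerate(edges)])

edges = quantile_edges(X_tr)
variants = {"raw": (X_tr, X_te),
            "generalized": (generalize(X_tr, edges), generalize(X_te, edges))}
for name, (tr, te) in variants.items():
    clf = DecisionTreeClassifier(random_state=0).fit(tr, y_tr)
    print(name, round(accuracy_score(y_te, clf.predict(te)), 3))
```

The drop in accuracy between the two runs is one concrete, if coarse, measure of the utility cost of the anonymization.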

2020-04-03
Jabeen, Gul, Ping, Luo.  2019.  A Unified Measurable Software Trustworthy Model Based on Vulnerability Loss Speed Index. 2019 18th IEEE International Conference On Trust, Security And Privacy In Computing And Communications/13th IEEE International Conference On Big Data Science And Engineering (TrustCom/BigDataSE). :18–25.

Trust is becoming increasingly important in the software domain, and because it is a complex, composite concept, people face great challenges in assessing it, especially given today's dynamic and constantly changing internet technology. Measuring software trustworthiness correctly and effectively also plays a significant role in earning users' trust when they choose among different software products. In the security context, trust has previously been measured from vulnerability occurrence times in order to predict the total number of vulnerabilities or their future occurrence times. In this study, we propose a new unified index, called the "loss speed index", that integrates the most important variables of software security, such as vulnerability occurrence time, count, and severity loss, and use it to evaluate overall software trustworthiness. Based on this new definition, we propose a new model called the software trustworthy security growth model (STSGM). This paper also fills a gap by addressing the severity of vulnerabilities: we propose a vulnerability severity prediction model whose results are further evaluated by STSGM to estimate the future loss speed index. Our work has several features: (1) it predicts the vulnerability severity/type in the future; (2) unlike traditional evaluation methods such as expert scoring, our model uses historical data to predict the future loss speed of software; and (3) the loss metric value is used to evaluate the risk associated with different software, which has a direct impact on software trustworthiness. Experiments were performed on real software vulnerability datasets, and the results are analyzed to check the correctness and effectiveness of the proposed model.
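
The abstract gives the ingredients of the loss speed index (occurrence time, count, severity) but not its formula, so the sketch below uses an illustrative placeholder: cumulative severity loss per unit time since release. It should be read as a shape for the computation, not as STSGM's actual index.

```python
from dataclasses import dataclass

@dataclass
class Vulnerability:
    days_since_release: float  # when the vulnerability was disclosed
    severity: float            # e.g., a CVSS base score

def loss_speed_index(vulns):
    """Illustrative placeholder: total severity loss accumulated per day
    over the observed window. The paper combines the same variables, but
    its exact definition is not reproduced here."""
    if not vulns:
        return 0.0
    horizon = max(v.days_since_release for v in vulns)
    return sum(v.severity for v in vulns) / horizon

history = [Vulnerability(30, 7.5), Vulnerability(90, 9.8), Vulnerability(200, 4.3)]
print(f"loss speed index: {loss_speed_index(history):.3f} severity/day")
```
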

2017-05-18
Hosseinzadeh, Shohreh, Virtanen, Seppo, Díaz-Rodríguez, Natalia, Lilius, Johan.  2016.  A Semantic Security Framework and Context-aware Role-based Access Control Ontology for Smart Spaces. Proceedings of the International Workshop on Semantic Big Data. :8:1–8:6.

Smart Spaces are composed of heterogeneous sensors and devices that collect and share information, which may include users' personal information. Thus, securing the data and preserving privacy are of paramount importance. In this paper, we propose techniques for information security and privacy protection in Smart Spaces based on the Smart-M3 platform. We propose a) a security framework and b) a context-aware role-based access control scheme. We model our access control scheme using ontological techniques and the Web Ontology Language (OWL), and implement it via CLIPS rules. To evaluate the efficiency of our access control scheme, we measure the time it takes to check the access rights of access requests. The results demonstrate that the highest response time is approximately 0.2 seconds on a set of 100,000 triples. We conclude that the proposed access control scheme introduces low overhead and is therefore an efficient approach for Smart Spaces.
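
The scheme itself is expressed as an OWL ontology with CLIPS rules; the Python sketch below captures only the underlying idea, that a role's permission is granted only when a context predicate over the current situation also holds. All role, permission, and context names are hypothetical.

```python
# role -> permission -> predicate over the current context
POLICY = {
    "nurse": {
        "read_vitals": lambda ctx: ctx["location"] == "ward" and ctx["on_shift"],
    },
    "visitor": {
        "control_lights": lambda ctx: ctx["location"] == "lobby",
    },
}

def check_access(role, permission, ctx):
    """Grant only if the role holds the permission and the context allows it."""
    rule = POLICY.get(role, {}).get(permission)
    return bool(rule and rule(ctx))

print(check_access("nurse", "read_vitals", {"location": "ward", "on_shift": True}))
print(check_access("nurse", "read_vitals", {"location": "cafe", "on_shift": True}))
```
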

Maleki, Hoda, Valizadeh, Saeed, Koch, William, Bestavros, Azer, van Dijk, Marten.  2016.  Markov Modeling of Moving Target Defense Games. Proceedings of the 2016 ACM Workshop on Moving Target Defense. :81–92.

We introduce a Markov-model-based framework for Moving Target Defense (MTD) analysis. The framework allows modeling of a broad range of MTD strategies, provides general theorems about how the probability of a successful adversary defeating an MTD strategy is related to the amount of time/cost spent by the adversary, and shows how a multilevel composition of MTD strategies can be analyzed by a straightforward combination of the analysis for each one of these strategies. Within the proposed framework we define the concept of security capacity which measures the strength or effectiveness of an MTD strategy: the security capacity depends on MTD specific parameters and more general system parameters. We apply our framework to two concrete MTD strategies.
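
As a toy instance of the kind of Markov analysis this abstract describes, the chain below tracks an adversary who advances through attack stages while an MTD move resets her progress with some probability per time step; the success probability within a time budget then falls out of matrix powers. The stage structure and rates are invented for illustration and are not one of the paper's two case studies.

```python
import numpy as np

def success_probability(p_advance, p_reset, steps, n_stages=3):
    """P(adversary reaches the final stage within `steps` time units) when
    each unit she advances one stage w.p. p_advance unless an MTD
    reconfiguration (w.p. p_reset) sends her back to stage 0."""
    n = n_stages + 1                       # states 0..n_stages; last absorbs
    P = np.zeros((n, n))
    for s in range(n_stages):
        P[s, s + 1] += p_advance * (1 - p_reset)
        P[s, 0] += p_reset
        P[s, s] += 1 - p_advance * (1 - p_reset) - p_reset
    P[n_stages, n_stages] = 1.0
    dist = np.zeros(n)
    dist[0] = 1.0
    return (dist @ np.linalg.matrix_power(P, steps))[n_stages]

for p_reset in (0.0, 0.1, 0.3):
    print(f"reset rate {p_reset}: "
          f"P(success in 50 steps) = {success_probability(0.4, p_reset, 50):.3f}")
```

Sweeping p_reset against the time budget is one crude way to see the trade-off behind the paper's notion of security capacity: faster reconfiguration buys a lower success probability for the same adversary effort.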

Wang, Weina, Ying, Lei, Zhang, Junshan.  2016.  The Value of Privacy: Strategic Data Subjects, Incentive Mechanisms and Fundamental Limits. Proceedings of the 2016 ACM SIGMETRICS International Conference on Measurement and Modeling of Computer Science. :249–260.

We study the value of data privacy in a game-theoretic model of trading private data, where a data collector purchases private data from strategic data subjects (individuals) through an incentive mechanism. The private data of each individual represents her knowledge about an underlying state, which is the information that the data collector desires to learn. Unlike most existing work on privacy-aware surveys, our model does not assume the data collector to be trustworthy; each individual therefore retains full control of her own data privacy and reports only a privacy-preserving version of her data. In this paper, the value of ε units of privacy is measured by the minimum payment over all nonnegative payment mechanisms under which an individual's best response at a Nash equilibrium is to report the data with a privacy level of ε. The higher ε is, the less private the reported data is. We derive lower and upper bounds on the value of privacy which are asymptotically tight as the number of data subjects becomes large. Specifically, the lower bound assures that it is impossible to buy ε units of privacy with a smaller payment, and the upper bound is given by an achievable payment mechanism that we design. Based on these fundamental limits, we further derive lower and upper bounds on the minimum total payment for the data collector to achieve a given learning accuracy target, and show that the total payment of the designed mechanism is at most one individual's payment away from the minimum.
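
The paper's mechanism design is more involved, but the notion of "ε units of privacy" can be made concrete with randomized response, the canonical way for an individual to report a privacy-preserving version of a binary datum at privacy level ε. The sketch below is only that building block, not the payment mechanism itself.

```python
import math
import random

def randomized_response(true_bit, epsilon):
    """Report the true bit with probability e^eps / (1 + e^eps), else flip it.
    This satisfies eps-differential privacy; in the paper's game, the
    individual chooses eps strategically in response to the offered payment."""
    p_truth = math.exp(epsilon) / (1.0 + math.exp(epsilon))
    return true_bit if random.random() < p_truth else 1 - true_bit

# Larger eps -> less private but more informative reports.
n = 100_000
for eps in (0.1, 1.0, 5.0):
    truthful = sum(randomized_response(1, eps) for _ in range(n)) / n
    print(f"eps={eps}: fraction of truthful reports ≈ {truthful:.3f}")
```
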

Shamsi, Zain, Loguinov, Dmitri.  2016.  Unsupervised Clustering Under Temporal Feature Volatility in Network Stack Fingerprinting. Proceedings of the 2016 ACM SIGMETRICS International Conference on Measurement and Modeling of Computer Science. :127–138.

Maintaining and updating signature databases is a tedious task that normally requires a large amount of user effort. The problem becomes harder when features can be distorted by observation noise, which we call volatility. To address this issue, we propose algorithms and models to automatically generate signatures in the presence of noise, with a focus on stack fingerprinting, which is a research area that aims to discover the operating system (OS) of remote hosts using TCP/IP packets. Armed with this framework, we construct a database with 420 network stacks, label the signatures, develop a robust classifier for this database, and fingerprint 66M visible webservers on the Internet.
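
A minimal sketch of classification under feature volatility: match an observed feature vector against each stored signature using a distance that down-weights volatile features. The three-feature signature database below is entirely hypothetical; the paper's classifier, feature set, and 420-stack database are far richer.

```python
import numpy as np

# Hypothetical signatures: OS label -> (mean feature vector, per-feature
# volatility), for features such as TCP window size, initial TTL, and an
# options-ordering code.
SIGNATURES = {
    "Linux 4.x":  (np.array([29200.0, 64.0, 3.0]), np.array([500.0, 1.0, 0.5])),
    "Windows 10": (np.array([65535.0, 128.0, 8.0]), np.array([800.0, 1.0, 0.5])),
    "FreeBSD 11": (np.array([65228.0, 64.0, 6.0]), np.array([600.0, 1.0, 0.5])),
}

def classify(observation):
    """Nearest signature under a volatility-scaled (diagonal Mahalanobis)
    distance, so noisy features count for less."""
    distances = {os_name: float(np.sum(((observation - mean) / vol) ** 2))
                 for os_name, (mean, vol) in SIGNATURES.items()}
    return min(distances, key=distances.get)

# A probe response whose TTL has been decremented in transit.
print(classify(np.array([29000.0, 63.0, 3.0])))  # -> Linux 4.x
```
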

Foremski, Pawel, Plonka, David, Berger, Arthur.  2016.  Entropy/IP: Uncovering Structure in IPv6 Addresses. Proceedings of the 2016 Internet Measurement Conference. :167–181.

In this paper, we introduce Entropy/IP: a system that discovers Internet address structure based on analyses of a subset of IPv6 addresses known to be active, i.e., training data, gleaned by readily available passive and active means. The system is completely automated and employs a combination of information-theoretic and machine learning techniques to probabilistically model IPv6 addresses. We present results showing that our system is effective in exposing structural characteristics of portions of the active IPv6 Internet address space, populated by clients, services, and routers. In addition to visualizing the address structure for exploration, the system uses its models to generate candidate addresses for scanning. For each of 15 evaluated datasets, we train on 1K addresses and generate 1M candidates for scanning. We achieve some success in 14 datasets, finding up to 40% of the generated addresses to be active. In 11 of these datasets, we find active network identifiers (e.g., /64 prefixes or "subnets") not seen in training. Thus, we provide the first evidence that it is practical to discover subnets and hosts by scanning probabilistically selected areas of the IPv6 address space not known to contain active hosts a priori.
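
The first stage of Entropy/IP is directly reproducible: compute the Shannon entropy of each of the 32 hex nibbles across the training addresses, then segment the address into fields of similar entropy. The sketch below does the entropy pass on a toy training set; the segmentation, Bayesian-network modelling, and candidate-generation stages are omitted.

```python
import math
from collections import Counter
from ipaddress import IPv6Address

def nibble_entropies(addresses):
    """Shannon entropy (bits) of each of the 32 hex nibbles of the given
    IPv6 addresses. Near-zero entropy marks fixed fields; high entropy
    marks identifiers or dense assignment."""
    expanded = [format(int(IPv6Address(a)), "032x") for a in addresses]
    result = []
    for i in range(32):
        counts = Counter(s[i] for s in expanded)
        total = sum(counts.values())
        result.append(-sum(c / total * math.log2(c / total)
                           for c in counts.values()))
    return result

addrs = ["2001:db8::1", "2001:db8::2:5", "2001:db8:0:1::7"]  # toy data
print([round(h, 2) for h in nibble_entropies(addrs)])
```
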

Sealfon, Adam.  2016.  Shortest Paths and Distances with Differential Privacy. Proceedings of the 35th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems. :29–41.

We introduce a model for differentially private analysis of weighted graphs in which the graph topology (V, E) is assumed to be public and the private information consists only of the edge weights w : E → R+. This can express hiding congestion patterns in a known system of roads. Differential privacy requires that the output of an algorithm provides little advantage, measured by privacy parameters ε and δ, for distinguishing between neighboring inputs, which are thought of as inputs that differ on the contribution of one individual. In our model, two weight functions w, w' are considered to be neighboring if they have ℓ1 distance at most one. We study the problems of privately releasing a short path between a pair of vertices and of privately releasing approximate distances between all pairs of vertices. We are concerned with the approximation error, the difference between the length of the released path or released distance and the length of the shortest path or actual distance. For the problem of privately releasing a short path between a pair of vertices, we prove a lower bound of Ω(|V|) on the additive approximation error for fixed privacy parameters ε, δ. We provide a differentially private algorithm that matches this error bound up to a logarithmic factor and releases paths between all pairs of vertices, not just a single pair. The approximation error achieved by our algorithm can be bounded by the number of edges on the shortest path, so we achieve better accuracy than the worst-case bound for pairs of vertices that are connected by a low-weight path consisting of o(|V|) vertices. For the problem of privately releasing all-pairs distances, we show that for trees we can release all-pairs distances with approximation error O(log^2.5 |V|) for fixed privacy parameters. For arbitrary bounded-weight graphs with edge weights in [0, M] we can release all distances with approximation error Õ(√(|V| M)).
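
A baseline that the paper improves upon is easy to state in code: perturb every edge weight with Laplace(1/ε) noise, then answer every distance query from the noisy graph. Since neighboring weight functions differ by at most 1 in ℓ1 distance, the noisy weights are ε-differentially private, and post-processing keeps all derived paths and distances private too; its error, however, grows with path length, which is what the paper's tree and bounded-weight algorithms beat.

```python
import networkx as nx
import numpy as np

def private_all_pairs_distances(G, epsilon, seed=0):
    """Laplace-mechanism baseline: eps-DP noisy edge weights, then exact
    shortest paths on the noisy graph (private by post-processing)."""
    rng = np.random.default_rng(seed)
    H = G.copy()
    for _, _, data in H.edges(data=True):
        noisy = data["weight"] + rng.laplace(scale=1.0 / epsilon)
        data["weight"] = max(0.0, noisy)  # clamping is post-processing
    return dict(nx.all_pairs_dijkstra_path_length(H, weight="weight"))

G = nx.path_graph(6)
nx.set_edge_attributes(G, 1.0, "weight")
dist = private_all_pairs_distances(G, epsilon=1.0)
print(dist[0][5])  # noisy distance between the path's endpoints
```
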

Miller, Carl A., Shi, Yaoyun.  2016.  Robust Protocols for Securely Expanding Randomness and Distributing Keys Using Untrusted Quantum Devices. J. ACM. 63:33:1–33:63.

Randomness is a vital resource for modern-day information processing, especially for cryptography. A wide range of applications critically rely on abundant, high-quality random numbers generated securely. Here, we show how to expand a random seed at an exponential rate without trusting the underlying quantum devices. Our approach is secure against the most general adversaries, and has the following new features: cryptographic level of security, tolerating a constant level of imprecision in devices, requiring only unit size quantum memory (for each device component) in an honest implementation, and allowing a large natural class of constructions for the protocol. In conjunction with a recent work by Chung et al. [2014], it also leads to robust unbounded expansion using just 2 multipart devices. When adapted for distributing cryptographic keys, our method achieves, for the first time, exponential expansion combined with cryptographic security and noise tolerance. The proof proceeds by showing that the Rényi divergence of the outputs of the protocol (for a specific bounding operator) decreases linearly as the protocol iterates. At the heart of the proof are a new uncertainty principle on quantum measurements and a method for simulating trusted measurements with untrusted devices.

Dimopoulos, Giorgos, Leontiadis, Ilias, Barlet-Ros, Pere, Papagiannaki, Konstantina.  2016.  Measuring Video QoE from Encrypted Traffic. Proceedings of the 2016 Internet Measurement Conference. :513–526.

Tracking and maintaining satisfactory QoE for video streaming services is becoming a greater challenge for mobile network operators than ever before. Downloading and watching video content on mobile devices is a growing trend among users, which is driving demand for higher bandwidth and better provisioning throughout the network infrastructure. At the same time, popular demand for privacy has led many online streaming services to adopt end-to-end encryption, leaving providers with only a handful of indicators for identifying QoE issues. To address these challenges, we propose a novel methodology for detecting video streaming QoE issues from encrypted traffic. We develop predictive models for detecting different levels of QoE degradation caused by three key influence factors: stalling, average video quality, and quality variations. The models are then evaluated on the production network of a large-scale mobile operator, where we show that, despite encryption, our methodology detects QoE problems with 72%-92% accuracy, while even higher performance is achieved on cleartext traffic.
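
A sketch of the modelling step on synthetic data: derive flow-level features that remain observable under encryption, label sessions with ground truth (e.g., from player logs), and train a classifier. Every feature, threshold, and label below is fabricated for illustration; only the overall shape (features from encrypted flows in, QoE-degradation labels out) follows the abstract.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n = 2000
throughput = rng.gamma(shape=4.0, scale=500.0, size=n)   # kbps, observable
burstiness = rng.exponential(scale=1.0, size=n)          # also observable
# Synthetic ground truth: low effective throughput tends to cause stalls.
stalled = ((throughput + rng.normal(0.0, 300.0, size=n)) < 900.0).astype(int)

X = np.column_stack([throughput, burstiness])
clf = RandomForestClassifier(n_estimators=100, random_state=0)
print("cross-validated stall-detection accuracy:",
      round(cross_val_score(clf, X, stalled, cv=5).mean(), 3))
```
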

2017-03-29
Zhang, Jun, Xiao, Xiaokui, Xie, Xing.  2016.  PrivTree: A Differentially Private Algorithm for Hierarchical Decompositions. Proceedings of the 2016 International Conference on Management of Data. :155–170.

Given a set D of tuples defined on a domain Ω, we study differentially private algorithms for constructing a histogram over Ω to approximate the tuple distribution in D. Existing solutions for the problem mostly adopt a hierarchical decomposition approach, which recursively splits Ω into sub-domains and computes a noisy tuple count for each sub-domain, until all noisy counts are below a certain threshold. This approach, however, requires that we (i) impose a limit h on the recursion depth in the splitting of Ω and (ii) set the noise in each count to be proportional to h. The choice of h is a serious dilemma: a small h makes the resulting histogram too coarse-grained, while a large h leads to excessive noise in the tuple counts used in deciding whether sub-domains should be split. Furthermore, h cannot be directly tuned based on D; otherwise, the choice of h itself reveals private information and violates differential privacy. To remedy the deficiency of existing solutions, we present PrivTree, a histogram construction algorithm that adopts hierarchical decomposition but completely eliminates the dependency on a pre-defined h. The core of PrivTree is a novel mechanism that (i) exploits a new analysis of the Laplace distribution and (ii) enables us to use only a constant amount of noise in deciding whether a sub-domain should be split, without worrying about the recursion depth of splitting. We demonstrate the application of PrivTree in modelling spatial data, and show that it can be extended to handle sequence data (where the decision in sub-domain splitting is based not on tuple counts but on a more sophisticated measure). Our experiments on a variety of real datasets show that PrivTree considerably outperforms the state of the art in terms of data utility.
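
A simplified one-dimensional rendering of the decomposition loop: before noise is added, each node's count is biased downward in proportion to its depth (and clamped from below), which is the device that lets PrivTree split without a pre-defined depth limit h. The constants below follow the spirit of the paper's calibration but are not its exact values, and the per-leaf counts reuse the noise budget loosely.

```python
import numpy as np

def privtree_1d(points, lo, hi, epsilon, theta=20.0, seed=0):
    """Toy PrivTree-style histogram over [lo, hi): split a node when its
    depth-biased, noised count exceeds the threshold theta."""
    rng = np.random.default_rng(seed)
    delta = 2.0 * np.log(2.0) / epsilon          # per-level downward bias
    lam = 2.0 / epsilon                          # Laplace scale
    leaves, queue = [], [(lo, hi, 0)]
    while queue:
        a, b, depth = queue.pop()
        count = int(np.sum((points >= a) & (points < b)))
        biased = max(count - depth * delta, theta - delta)  # clamp from below
        if biased + rng.laplace(scale=lam) > theta and (b - a) > 1e-6:
            mid = (a + b) / 2.0
            queue += [(a, mid, depth + 1), (mid, b, depth + 1)]
        else:
            leaves.append((a, b, count + rng.laplace(scale=lam)))
    return leaves

rng = np.random.default_rng(1)
data = np.concatenate([rng.normal(0.3, 0.05, 500), rng.normal(0.8, 0.02, 500)])
for a, b, c in sorted(privtree_1d(data, 0.0, 1.0, epsilon=1.0)):
    print(f"[{a:.3f}, {b:.3f}): ~{max(c, 0.0):.0f}")
```
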

2016-07-13
Giulia Fanti, University of Illinois at Urbana-Champaign, Peter Kairouz, University of Illinois at Urbana-Champaign, Sewoong Oh, University of Illinois at Urbana-Champaign, Kannan Ramchandran, University of California, Berkeley, Pramod Viswanath, University of Illinois at Urbana-Champaign.  2016.  Rumor Source Obfuscation on Irregular Trees. ACM SIGMETRICS.

Anonymous messaging applications have recently gained popularity as a means for sharing opinions without fear of judgment or repercussion. These messages propagate anonymously over a network, typically defined by social connections or physical proximity. However, recent advances in rumor source detection show that the source of such an anonymous message can be inferred by certain statistical inference attacks. Adaptive diffusion was recently proposed as a solution that achieves optimal source obfuscation over regular trees. However, in real social networks, the degrees differ from node to node, and adaptive diffusion can be significantly sub-optimal. This gap increases as the degrees become more irregular.

In order to quantify this gap, we model the underlying network as coming from standard branching processes with i.i.d. degree distributions. Building upon the analysis techniques from branching processes, we give an analytical characterization of the dependence of the probability of detection achieved by adaptive diffusion on the degree distribution. Further, this analysis provides a key insight: passing a rumor to a friend who has many friends makes the source more ambiguous. This leads to a new family of protocols that we call Preferential Attachment Adaptive Diffusion (PAAD). When messages are propagated according to PAAD, we give both the MAP estimator for finding the source and also an analysis of the probability of detection achieved by this adversary. The analytical results are not directly comparable, since the adversary's observed information has a different distribution under adaptive diffusion than under PAAD. Instead, we present results from numerical experiments that suggest that PAAD achieves a lower probability of detection, at the cost of increased communication for coordination.
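
The key insight ("pass the rumor to a friend who has many friends") is easy to animate. The toy below moves only the protocol's virtual-source token, choosing the next holder with probability proportional to degree; the full PAAD protocol additionally spreads the message symmetrically around that token, which is omitted here. The graph and parameters are illustrative.

```python
import random
import networkx as nx

def paad_token_walk(G, source, hops, seed=0):
    """Move the virtual-source token for `hops` steps, preferring
    high-degree neighbors (the preferential-attachment choice)."""
    rng = random.Random(seed)
    token, previous = source, None
    for _ in range(hops):
        candidates = [v for v in G.neighbors(token) if v != previous]
        if not candidates:
            break
        weights = [G.degree(v) for v in candidates]
        previous, token = token, rng.choices(candidates, weights=weights)[0]
    return token

# Barabasi-Albert with m=1 yields a tree with irregular degrees.
G = nx.barabasi_albert_graph(200, 1, seed=42)
print("virtual source after 5 hops:", paad_token_walk(G, source=0, hops=5))
```

The farther the token drifts into well-connected regions, the larger the set of leaves that could plausibly have originated the message, which is exactly what lowers the adversary's detection probability.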