Biblio
Wake locks are widely used in Android apps to protect critical computations from being disrupted by device sleeping. Inappropriate use of wake locks often seriously impacts user experience. However, little is known about how wake locks are used in real-world Android apps and the impact of their misuse. To bridge this gap, we conducted a large-scale empirical study on 44,736 commercial and 31 open-source Android apps. Through automated program analysis and manual investigation, we observed (1) common program points where wake locks are acquired and released, (2) 13 types of critical computational tasks that are often protected by wake locks, and (3) eight patterns of wake lock misuse that commonly cause functional and non-functional issues, only three of which had been studied by existing work. Based on our findings, we designed a static analysis technique, Elite, to detect the two most common patterns of wake lock misuse. Our experiments on real-world subjects showed that Elite is effective and outperforms two state-of-the-art techniques.
We study the problem of k-anonymization of mail messages in the realistic scenario of auditing mail traffic in a major commercial Web mail service. Mail auditing is necessary in various Web mail debugging and quality assurance activities, such as anti-spam or the qualitative evaluation of novel mail features. It is conducted by trained professionals, often referred to as "auditors", who are shown messages that could expose personally identifiable information. We address here the challenge of k-anonymizing such messages, focusing on machine-generated mail messages that represent more than 90% of today's mail traffic. We introduce a novel message signature, Mail-Hash, specifically tailored to identifying structurally-similar messages, which allows us to put such messages in the same equivalence class. We then define a process that generates, for each class, masked mail samples that can be shown to auditors, while guaranteeing the k-anonymity of users. The productivity of auditors is measured by the amount of non-hidden mail content they can see every day, while considering normal working conditions, which limit the number of mail samples they can review. In addition, we consider k-anonymity over time since, by definition of k-anonymity, every new release places additional constraints on the assignment of samples. We describe in detail the results we obtained over actual Yahoo mail traffic, and thus demonstrate that our methods are feasible at Web mail scale. Given users' constantly growing concern over their email being scanned by others, we argue that it is critical to devise algorithms that guarantee k-anonymity, and to implement the associated processes, in order to restore the trust of mail users.
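The Mail-Hash construction itself is specific to the paper, but the underlying idea, collapsing messages generated from the same template into one equivalence class and releasing samples only from classes large enough to hide any individual, can be made concrete with a short sketch. Everything below (the masking regex, the hashing, the k threshold on distinct users) is an illustrative assumption, not the paper's algorithm.

```python
import hashlib
import re
from collections import defaultdict

# Mask variable fields so that machine-generated messages built from the
# same template collapse to the same structural signature.
VARIABLE = re.compile(
    r"[\w.+-]+@[\w-]+\.[\w.]+"   # email addresses
    r"|https?://\S+"             # links
    r"|\d+"                      # numbers (order ids, dates, amounts)
)

def structural_signature(body: str) -> str:
    skeleton = VARIABLE.sub("*", body).lower()
    return hashlib.sha256(skeleton.encode()).hexdigest()

def k_anonymous_classes(messages, k=10):
    """messages: iterable of (user, body). Return only the equivalence
    classes containing at least k distinct users; smaller classes are
    never exposed to auditors."""
    classes = defaultdict(list)
    for user, body in messages:
        classes[structural_signature(body)].append((user, body))
    return {sig: msgs for sig, msgs in classes.items()
            if len({u for u, _ in msgs}) >= k}
```

Masked samples drawn from a surviving class can then be shown to auditors, since each is structurally indistinguishable from the messages of at least k - 1 other users.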
We live in the era of mobile computing. Mobile devices have more sensors and more capabilities than desktop computers. For any computing device that contains sensitive information and accesses the Internet, security is a major concern for both enterprises and end-users. Of the mobile devices commonly in use, this research focuses on the ways in which the popular iOS and Android platforms handle permissions, in an attempt to discern whether there are any identifiable trends on either platform with respect to applications being over- or underprivileged.
In this paper, we present a multi-featured supervised automatic keyword extraction system. We extracted salient semantic features that are descriptive of candidate keyphrases and trained a Random Forest classifier on them. The system achieved a precision of 58.3% and was shown to outperform two top-performing systems when benchmarked on a crowdsourced dataset. Furthermore, our approach achieved best precision and F-measure scores of 32.7 and 25.5, respectively, on the SemEval keyphrase extraction challenge dataset. The paper describes the approaches used as well as the results obtained.
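As a rough illustration of the pipeline (candidate features in, Random Forest out), here is a minimal sketch using scikit-learn. The paper's actual "salient semantic features" are richer than these; every feature below is an assumption chosen only to make the shape of the system concrete.

```python
from sklearn.ensemble import RandomForestClassifier

def features(candidate: str, doc: str):
    """A few plausible surface features of a candidate keyphrase
    (assumes the candidate occurs in the document)."""
    first = doc.find(candidate)
    return [
        len(candidate.split()),        # phrase length in words
        doc.count(candidate),          # term frequency
        first / max(len(doc), 1),      # relative position of first occurrence
        int(candidate.istitle()),      # capitalization cue
    ]

def train(labelled):
    """labelled: iterable of (candidate, doc, is_keyphrase) triples."""
    X = [features(c, d) for c, d, _ in labelled]
    y = [int(lab) for _, _, lab in labelled]
    clf = RandomForestClassifier(n_estimators=100, random_state=0)
    clf.fit(X, y)
    return clf
```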
The main goal of this work is to create a model of trust that can serve as a reference for developing applications oriented toward collaborative annotation. Such a model includes design parameters inferred from online communities built on collaborative content. This study aims to create a static model, although the model could be made dynamic, or split into several models, depending on the application context. We analyzed Genius as a peer production community to understand user behavior. The study characterizes user interactions based on the distinction between Lightweight Peer Production (LWPP) and Heavyweight Peer Production (HWPP). We found that more LWPP interactions take place at the lower levels of the role system; as a user's level in the role system increases, HWPP interactions become more frequent. This can be explained by the fact that LWPP interactions are straightforward, while HWPP interactions demand more agility from the user. The latter provide more opportunities and therefore attract other users for further interaction.
When running large human computation tasks in the real world, honeypots play an important role in assessing the overall quality of the work produced. Generating such honeypots can place a significant burden on the task owner, as they require specific characteristics in their design and implementation, as well as continuous maintenance, when operating data pipelines that include a human computation component. In this extended abstract, we outline a novel approach for creating honeypots using questions automatically generated from a reference knowledge base, with the ability to control parameters such as topic and difficulty.
Signed social networks have become increasingly important in recent years because of the ability to model trust-based relationships in review sites like Slashdot, Epinions, and Wikipedia. As a result, many traditional network mining problems have been revisited in the context of networks in which signs are associated with the links. Examples of such problems include community detection, link prediction, and low rank approximation. In this paper, we examine the problem of ranking nodes in signed networks. In particular, we design a ranking model that has a clear physical interpretation in terms of the signs of the edges in the network. Specifically, we propose the Troll-Trust model, which models the probability of trustworthiness of individual data sources as an interpretation of the underlying ranking values. We show the advantages of this approach over a variety of baselines.
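The paper defines the exact Troll-Trust equations; the sketch below conveys only the general shape of such a model: a fixed-point iteration in which a node's trustworthiness probability is pushed up by positive edges from trusted nodes and pulled down by negative ones. The logistic update, the prior beta, and the iteration count are assumptions of this sketch, not the published model.

```python
import math

def trust_ranking(nodes, edges, beta=0.5, lam=1.0, iters=50):
    """edges: list of (u, v, sign) with sign in {+1, -1}.
    Returns pi[v], an estimate of the probability that v is trustworthy."""
    incoming = {v: [] for v in nodes}
    for u, v, s in edges:
        incoming[v].append((u, s))
    pi = {v: beta for v in nodes}   # neutral prior
    for _ in range(iters):
        pi = {
            v: 1.0 / (1.0 + math.exp(-lam * sum(pi[u] * s
                                                for u, s in incoming[v])))
            for v in nodes
        }
    return pi
```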
Given "who-trusts/distrusts-whom" information, how can we propagate the trust and distrust? With the appearance of fraudsters in social network sites, the importance of trust prediction has increased. Most such methods use only explicit and implicit trust information (e.g., if Smith likes several of Johnson's reviews, then Smith implicitly trusts Johnson), but they do not consider distrust. In this paper, we propose PIN-TRUST, a novel method to handle all three types of interaction information: explicit trust, implicit trust, and explicit distrust. The novelties of our method are the following: (a) it is carefully designed, to take into account positive, implicit, and negative information, (b) it is scalable (i.e., linear on the input size), (c) most importantly, it is effective and accurate. Our extensive experiments with a real dataset, Epinions.com data, of 100K nodes and 1M edges, confirm that PIN-TRUST is scalable and outperforms existing methods in terms of prediction accuracy, achieving up to 50.4 percentage relative improvement.
This study examines the trusted third party (TTP) in Australia's B2C marketplace and the factors influencing consumers' trust behaviour, from the perspective of consumers' online trust. Based on a literature review, combined with the development status and background of Australian e-commerce, and underpinned by the Theory of Planned Behaviour (TPB) and a conceptual trust model, this paper details the specific factors and influence mechanisms of TTPs on consumers' trust behaviour. The paper also explains two different functions of TTPs in solving the online trust problem faced by consumers, and summarizes five different types of services provided by TTPs during the establishment of the trust relationship. Finally, the present study selects 100 B2C enterprises by simple random sampling and makes a detailed analysis of their TTPs, to verify the services and functions of the proposed TTP in the trust model. This study is of significance for understanding the influence mechanisms, functions, and services of TTPs on consumers' trust behaviour in the realistic Australian B2C environment.
Trust plays an important role in various user-facing systems and applications. It is particularly important in the context of decision support systems, where the system's output serves as one of the inputs for the users' decision making processes. In this work, we study the dynamics of explicit and implicit user trust in a simulated automated quality monitoring system, as a function of the system accuracy. We establish that users correctly perceive the accuracy of the system and adjust their trust accordingly.
To adapt to the rapidly evolving landscape of cyber threats, security professionals actively exchange Indicators of Compromise (IOC) (e.g., malware signatures, botnet IPs) through public sources (e.g., blogs, forums, and tweets). Such information, often presented in articles, posts, and white papers, can be converted into a machine-readable OpenIOC format for automatic analysis and quick deployment to security mechanisms such as intrusion detection systems. With hundreds of thousands of sources in the wild, IOC data are today produced at a volume and velocity that are increasingly hard for humans to manage. Efforts to automatically gather such information from unstructured text, however, are impeded by the limitations of today's Natural Language Processing (NLP) techniques, which cannot meet the high standard (in terms of accuracy and coverage) expected of IOCs that serve as direct input to a defense system. In this paper, we present iACE, an innovative solution for fully automated IOC extraction. Our approach is based on the observation that the IOCs in technical articles are often described in a predictable way: they are connected to a set of context terms (e.g., "download") through stable grammatical relations. Leveraging this observation, iACE automatically locates a putative IOC token (e.g., a zip file) and its context (e.g., "malware", "download") within the sentences of a technical article, and further analyzes their relations through a novel application of graph mining techniques. Once the grammatical connection between the tokens is found to be in line with the way IOCs are commonly presented, these tokens are extracted to generate an OpenIOC item that describes not only the indicator (e.g., a malicious zip file) but also its context (e.g., downloaded from an external source). Running on 71,000 articles collected from 45 leading technical blogs, this new approach demonstrates remarkable performance: it generated 900K OpenIOC items with a precision of 95% and a coverage of over 90%, far beyond what state-of-the-art NLP techniques and industry IOC tools can achieve, at a speed of thousands of articles per hour. Further, by correlating the IOCs mined from articles published over a 13-year span, our study sheds new light on the links across hundreds of seemingly unrelated attack instances, particularly their shared infrastructure resources, as well as the impact of such open-source threat intelligence on security protection and the evolution of attack strategies.
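A faithful reimplementation of iACE's graph mining over grammatical relations is beyond the scope of an abstract, but the input/output shape of such an extractor can be sketched. The version below substitutes sentence-level co-occurrence for grammatical connection, and its regexes and context vocabulary are illustrative assumptions only; it will be far noisier than the dependency-based analysis the paper describes.

```python
import re

# IOC candidate patterns (a small, assumed subset).
CANDIDATES = {
    "ipv4":   re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
    "md5":    re.compile(r"\b[a-fA-F0-9]{32}\b"),
    "sha256": re.compile(r"\b[a-fA-F0-9]{64}\b"),
}
# Context terms suggesting a token is an indicator, not an ordinary value.
CONTEXT_TERMS = {"download", "malware", "payload", "dropper", "callback", "c2"}

def extract_iocs(article: str):
    """Return (kind, token, context_terms) for candidates that share a
    sentence with at least one context term."""
    results = []
    for sentence in re.split(r"(?<=[.!?])\s+", article):
        words = set(re.findall(r"[a-z0-9]+", sentence.lower()))
        context = sorted(words & CONTEXT_TERMS)
        if not context:
            continue
        for kind, pattern in CANDIDATES.items():
            for token in pattern.findall(sentence):
                results.append((kind, token, context))
    return results
```

Each extracted triple corresponds roughly to an OpenIOC item: the indicator plus the context under which it was mentioned.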
This paper presents a framework for (1) generating variants of known attacks, (2) replaying attack variants in an isolated environment, and (3) validating the detection capabilities of attack detection techniques against the variants. Our framework facilitates reproducible security experiments. We generated 648 variants of three real-world attacks (observed at the National Center for Supercomputing Applications at the University of Illinois). Our experiments showed the value of generating attack variants by quantifying the detection capabilities of three detection methods: a signature-based technique, an anomaly-based technique, and a probabilistic graphical model-based technique.
Kernel rootkits often hide associated malicious processes by altering the task_struct information reported to upper layers and applications such as ps and top. Virtualized settings offer a unique opportunity to mitigate this behavior using dynamic virtual machine introspection (VMI). For known kernels, VMI can be deployed to search for kernel objects and identify them using unique data structure "signatures". In existing work, VMI-detected data structure signatures are based on values and structural features that must be present (often exactly) in the memory snapshots taken for detection to be accurate. This makes such signatures brittle: rootkits can escape detection by simply, temporarily "un-tangling" the corresponding structures when not running. Here we introduce a new paradigm that defeats such behavior by training for and observing signatures of timing access patterns to any and all kernel-mapped data regions, including objects that are not directly linked in the "official" list of tasks. The use of timing information in training detection signatures renders the defense resistant to attacks that try to evade detection by removing their malicious processes before scans. Our prototype, KXRay, successfully detected processes hidden by four traditional rootkits.
Attribute-based methods provide authorization to parties based on whether their set of attributes (e.g., age or organization) fulfills a policy. In attribute-based encryption (ABE), authorized parties can decrypt, and in attribute-based credentials (ABCs), authorized parties can authenticate themselves. In this paper, we combine elements of ABE and ABCs with garbled circuits to construct attribute-based key exchange (ABKE). Our focus is on an interactive solution involving a client that holds a certificate (issued by an authority) vouching for that client's attributes, and a server that holds a policy computable on such a set of attributes. The goal is for the server to establish a shared key with the client, but only if the client's certified attributes satisfy the policy. Our solution enjoys strong privacy guarantees for both the client and the server, including attribute privacy and unlinkability of client sessions. Our main contribution is a construction of ABKE for arbitrary circuits with high (concrete) efficiency. Specifically, we support general policies expressible as boolean circuits computed on a set of attributes. Even for policies containing hundreds of thousands of gates, the performance cost is dominated by two pairing computations per policy input. Put another way, for a cost similar to prior ABE/ABC solutions, which can only support small formulas efficiently, we can support vastly richer policies. We implemented our solution and report on its performance. For policies with 100,000 gates and 200 inputs over a realistic network, the server and client spend 957 ms and 176 ms on computation, respectively. When using offline preprocessing and batch signature verification, this drops to only 243 ms and 97 ms.
With the emergence of the Internet of Things (IoT) and participatory sensing (PS) paradigms, the trustworthiness of remotely sensed data has become a vital research question. In this work, we present the design of a trusted sensor, which uses physically unclonable functions (PUFs) as an anchor to provide integrity, authenticity, and non-repudiation guarantees on the sensed data. We propose trusted sensors for mobile devices to address the potential manipulation of mobile sensors' readings through the exploitation of mobile OS vulnerabilities in participatory sensing for IoT applications. Preliminary results from our implementation of a trusted visual sensor node show that the proposed security solution can be realized without consuming a significant amount of the sensor node's resources.
The Cramer-Shoup cryptosystem has attracted much attention from the research community, mainly due to its efficient encryption/decryption and its provable security against adaptive chosen-ciphertext attacks in the standard model. At TCC 2005, Vasco et al. proposed a method for building Cramer-Shoup-like cryptosystems over non-abelian groups and raised the open problem of finding a secure instantiation. Building on this work, we present another general framework for constructing Cramer-Shoup-like cryptosystems. We first propose the concept of an index exchangeable family (IEF) and an abstract construction of a Cramer-Shoup-like encryption scheme over IEFs. Concrete instantiations of IEFs are then derived from reasonable hardness assumptions over abelian groups as well as non-abelian groups. These instantiations ultimately lead to simple yet efficient Cramer-Shoup-like cryptosystems, including new non-abelian analogs that are potential solutions to Vasco et al.'s open problem. Moreover, we propose a secure outsourcing method for the encryption of the non-abelian analog, based on the factorization problem over non-commutative groups. Our experiments clearly indicate that the computational cost of the outsourcing scheme can be significantly reduced thanks to load sharing with cloud datacenter servers.
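For readers who want the baseline being generalized, here is a compact sketch of textbook Cramer-Shoup over the quadratic-residue subgroup of a safe prime. The 64-bit toy group is for illustration only (nowhere near a secure parameter size), and the paper's IEF abstraction and non-abelian instantiations replace this concrete group structure.

```python
import hashlib
import secrets

def _is_prime(n, rounds=40):
    """Miller-Rabin probabilistic primality test."""
    if n < 2:
        return False
    for p in (2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37):
        if n % p == 0:
            return n == p
    d, s = n - 1, 0
    while d % 2 == 0:
        d, s = d // 2, s + 1
    for _ in range(rounds):
        x = pow(secrets.randbelow(n - 3) + 2, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(s - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False
    return True

def _safe_prime(bits=64):
    while True:
        q = secrets.randbits(bits) | (1 << (bits - 1)) | 1
        if _is_prime(q) and _is_prime(2 * q + 1):
            return 2 * q + 1, q

p, q = _safe_prime()      # work in the order-q subgroup of squares mod p
g1, g2 = 4 % p, 9 % p     # squares, hence generators of that subgroup

# Key generation.
x1, x2, y1, y2, z = (secrets.randbelow(q) for _ in range(5))
c = pow(g1, x1, p) * pow(g2, x2, p) % p
d = pow(g1, y1, p) * pow(g2, y2, p) % p
h = pow(g1, z, p)

def _H(u1, u2, e):
    data = b"|".join(str(t).encode() for t in (u1, u2, e))
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % q

def encrypt(m):           # m must lie in the subgroup (a square mod p)
    r = secrets.randbelow(q)
    u1, u2 = pow(g1, r, p), pow(g2, r, p)
    e = pow(h, r, p) * m % p
    a = _H(u1, u2, e)
    v = pow(c, r, p) * pow(d, r * a % q, p) % p
    return u1, u2, e, v

def decrypt(u1, u2, e, v):
    a = _H(u1, u2, e)
    if pow(u1, (x1 + y1 * a) % q, p) * pow(u2, (x2 + y2 * a) % q, p) % p != v:
        raise ValueError("invalid ciphertext")   # CCA validity check failed
    return e * pow(u1, q - z, p) % p             # u1^(q-z) = u1^(-z)

m = pow(5, 2, p)          # a quadratic residue, so a valid message
assert decrypt(*encrypt(m)) == m
```

The validity check on (u1, u2, v) is what buys security against adaptive chosen-ciphertext attacks; it is this algebraic structure that the IEF framework abstracts away from concrete groups.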
Anonymous Identity-Based Broadcast Encryption (AIBBE) allows a sender to broadcast a ciphertext to multiple receivers while preserving the receivers' anonymity. Existing AIBBE schemes fail to achieve efficient decryption or strong security, such as constant decryption complexity, security under adaptive attacks, or security in the standard model. We therefore propose two new AIBBE schemes that overcome the drawbacks of the state of the art. Our main contribution is an AIBBE scheme with constant decryption complexity and provable security under adaptive attacks in the standard model. To the best of our knowledge, it is the first scheme to achieve all of the above properties, and its strong security constitutes a meaningful theoretical contribution. We also propose another AIBBE scheme in the Random Oracle (RO) model, which our experiments show to be of practical interest.
We study a sensor network setting in which samples are encrypted individually using different keys and maintained on cloud storage. For large systems, e.g., those generating several million samples per day, fine-grained sharing of encrypted samples is challenging. Existing solutions, such as Attribute-Based Encryption (ABE) and Key Aggregation Cryptosystem (KAC), can address the challenge, but only to a certain extent: they are often computationally expensive and thus unlikely to operate at scale. We propose an algorithmic enhancement and two heuristics that improve KAC's key reconstruction cost while preserving its provable security. The improvement is particularly significant for range and down-sampling queries, accelerating reconstruction from quadratic to linear running time. An experimental study shows that for queries of 32K samples, the proposed fast reconstruction techniques speed up the original KAC by at least 90 times on range and down-sampling queries, and by eight times on general (arbitrary) queries. It also shows that, at the expense of splitting a query into 16 sub-queries and correspondingly issuing that many aggregated keys, reconstruction time can be reduced by 19 times. As such, the proposed techniques make KAC more applicable in practical scenarios such as sensor networks or the Internet of Things.
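The paper's actual reconstruction algorithm is not reproduced here, but the flavor of a quadratic-to-linear amortization is familiar from the classic prefix/suffix-product trick: when each of n per-sample reconstructions needs the product of all the other samples' public group elements, naive per-sample products cost O(n^2) group operations, while shared partial products cost O(n). This sketch is only an analogy (integers modulo a prime stand in for group elements) and makes no claim to match KAC's algebra.

```python
def products_except_self(vals, mod):
    """For each index i, the product of all vals[j] with j != i,
    in O(n) multiplications via prefix and suffix products."""
    n = len(vals)
    prefix = [1] * (n + 1)
    for i, v in enumerate(vals):
        prefix[i + 1] = prefix[i] * v % mod
    suffix = [1] * (n + 1)
    for i in range(n - 1, -1, -1):
        suffix[i] = suffix[i + 1] * vals[i] % mod
    return [prefix[i] * suffix[i + 1] % mod for i in range(n)]
```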
Advances in reverse engineering techniques have complicated efforts in intellectual property protection. Proactive methods have been developed in response, among which layout-level IC camouflaging is the leading example. However, existing camouflaging methods are rarely supported by provably secure criteria, which leads to overestimation of the security level against the latest de-camouflaging attacks, e.g., the SAT-based attack. In this paper, a quantitative security criterion is proposed for measuring de-camouflaging complexity and is formally analyzed by demonstrating the equivalence between the existing de-camouflaging strategy and an active learning scheme. Supported by the new security criterion, two novel camouflaging techniques are proposed, a low-overhead camouflaging cell library and an AND-tree structure, which achieve exponentially increasing security levels at the cost of linearly increasing performance overhead on the circuit under protection. A provably secure camouflaging framework is then developed by combining the two techniques. Experimental results using the proposed security criterion show that circuits camouflaged with the framework are highly resilient against the SAT-based attack, with negligible performance overhead.
We present the first complexity-theoretically secure steganographic protocol that, for any communication channel, is provably secure, reliable, and has nearly optimal bandwidth. Our system is unconditionally secure, i.e., our proof does not rely on any unproven complexity-theoretic assumption, such as the existence of one-way functions. This disproves the claim that the existence of one-way functions and access to a communication channel oracle are both necessary and sufficient conditions for the existence of secure steganography, in the sense that secure and reliable steganography exists independently of the existence of one-way functions.
Anonymous authentication allows one to authenticate herself without revealing her identity, and has become an important technique for constructing privacy-preserving Internet connections. Anonymous password authentication is highly desirable as it enables a client to authenticate herself with a human-memorable password while preserving her privacy. In this paper, we introduce a novel approach for designing anonymous password-authenticated key exchange (APAKE) protocols using algebraic message authentication codes (MACs), where an algebraic MAC wrapped by a password is used by a client for anonymous authentication, and a server issues algebraic MACs to clients and acts as the verifier of login protocols. Our APAKE construction is secure provided that the algebraic MAC is strongly existentially unforgeable under random message and chosen verification queries attack (suf-rmva), is weakly pseudorandom and tag-randomization simulatable, and has simulation-sound extractable non-interactive zero-knowledge proofs (SE-NIZKs). To design practical APAKE protocols, we instantiate an algebraic MAC based on the q-SDH assumption that satisfies all the required properties, and construct credential presentation algorithms for the MAC that have optimal efficiency for a randomize-then-prove paradigm. Based on this algebraic MAC, we instantiate a highly practical APAKE protocol, denoted APAKE, which is much more efficient than the mechanisms specified by ISO/IEC 20009-4. An efficient revocation mechanism for APAKE is also proposed.
We integrate APAKE into TLS to provide an anonymous client authentication mode in which clients holding passwords can authenticate themselves to a server anonymously. Our implementation with 128-bit security shows that the average connection time of the APAKE-based ciphersuite is 2.8 ms. With APAKE integrated into the OpenSSL library and using an Apache web server on a 2-core desktop computer, we could serve 953 ECDHE-ECDSA-AES128-GCM-SHA256 HTTPS connections per second for a 10 KB payload. Compared to an ECDSA-signed elliptic curve Diffie-Hellman ciphersuite with mutual authentication, this amounts to a 0.27 KB increase in handshake size and a 13% reduction in throughput.
When a group of individuals and organizations wish to compute a stable matching (for example, when medical students are matched to medical residency programs), they often outsource the computation to a trusted arbiter in order to preserve the privacy of participants' preferences. Secure multi-party computation offers the possibility of private matching processes that do not rely on any common trusted third party. However, stable matching algorithms have previously been considered infeasible for execution in a secure multi-party context on non-trivial inputs because they are computationally intensive and involve complex data-dependent memory access patterns. We adapt the classic Gale-Shapley algorithm for use in such a context, and show experimentally that our modifications yield a lower asymptotic complexity and more than an order of magnitude in practical cost improvement over previous techniques. Our main improvements stem from designing new oblivious data structures that exploit the properties of the matching algorithms. We apply a similar strategy to scale the Roth-Peranson instability chaining algorithm, currently in use by the National Resident Matching Program. The resulting protocol is efficient enough to be useful at the scale required for matching medical residents nationwide, taking just over 18 hours to complete an execution simulating the 2016 national resident match with more than 35,000 participants and 30,000 residency slots.
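The oblivious data structures and secure-computation protocol are the paper's contribution and are not sketched here; for reference, this is the classic (non-private) Gale-Shapley algorithm that the protocol adapts, in plain Python.

```python
from collections import deque

def gale_shapley(proposer_prefs, acceptor_prefs):
    """Classic deferred-acceptance: proposers end up with their best
    stable partner. Each side maps a name to an ordered preference list."""
    rank = {a: {p: r for r, p in enumerate(prefs)}
            for a, prefs in acceptor_prefs.items()}
    next_choice = {p: 0 for p in proposer_prefs}   # next index to propose to
    engaged = {}                                   # acceptor -> proposer
    free = deque(proposer_prefs)
    while free:
        p = free.popleft()
        a = proposer_prefs[p][next_choice[p]]
        next_choice[p] += 1
        if a not in engaged:
            engaged[a] = p
        elif rank[a][p] < rank[a][engaged[a]]:     # a prefers p: swap
            free.append(engaged[a])
            engaged[a] = p
        else:
            free.append(p)                         # rejected, try next choice
    return {p: a for a, p in engaged.items()}

students = {"alice": ["mgh", "ucsf"], "bob": ["mgh", "ucsf"]}
programs = {"mgh": ["bob", "alice"], "ucsf": ["alice", "bob"]}
print(gale_shapley(students, programs))   # {'bob': 'mgh', 'alice': 'ucsf'}
```

Running this inside a secure multi-party computation is what forces the paper's oblivious redesign: the data-dependent branches and array accesses above would otherwise leak participants' preferences.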
In this paper, we propose a new risk analysis framework that enables the supervision of risk in complex and distributed systems. Our contribution is twofold. First, we provide Risk Assessment Graphs (RAGs) as a model for risk analysis. This graph-based model is adaptable to system changes over time. We also introduce the potentiality and accessibility functions, which, during each time slot, evaluate respectively the chance of exploiting the RAG's nodes and the connection time between these nodes. In addition, we provide a worst-case risk evaluation approach, based on the assumption that intruders usually aim to maximise their benefit by inflicting the maximum damage on the target system (i.e., by choosing the most likely paths in the RAG). We then introduce three security metrics: the propagated risk, the node risk, and the global risk. We illustrate the use of our framework through the simple example of an enterprise email service. Our framework achieves both the flexibility and generality requirements: it can be used to assess external threats as well as insider ones, and it applies to a wide set of applications.
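The paper's potentiality and accessibility functions are not reproduced here, but the worst-case assumption (the intruder follows the most likely path) admits a standard computation: maximizing a product of per-edge exploit probabilities is a shortest-path search on their negative logarithms. The sketch below, including the toy email-service numbers, is a generic illustration under that assumption, not the paper's metric definitions.

```python
import heapq
import math

def most_likely_compromise(graph, source):
    """graph: node -> list of (successor, exploit_probability).
    Returns, per node, the probability of the most likely attack path
    from source, via Dijkstra on -log(probability)."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, math.inf):
            continue
        for v, prob in graph.get(u, []):
            nd = d - math.log(prob)          # multiplying probs = adding -logs
            if nd < dist.get(v, math.inf):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return {v: math.exp(-d) for v, d in dist.items()}

g = {"internet": [("mail_gateway", 0.6)],
     "mail_gateway": [("mail_server", 0.8)],
     "mail_server": [("user_db", 0.3)]}
print(most_likely_compromise(g, "internet"))
# approx: {'internet': 1.0, 'mail_gateway': 0.6,
#          'mail_server': 0.48, 'user_db': 0.144}
```

Metrics in the spirit of the paper's propagated and node risk could then weight such probabilities by per-node damage estimates.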
Motivated by the impossibility of achieving fairness in secure computation [Cleve, STOC 1986], recent works study a model of fairness in which an adversarial party that aborts on receiving output is forced to pay a mutually predefined monetary penalty to every other party that did not receive the output. These works show how to design protocols for secure computation with penalties that tolerate an arbitrary number of corruptions. In this work, we improve the efficiency of protocols for secure computation with penalties in a hybrid model where parties have access to the "claim-or-refund" transaction functionality. Our first improvement is for the ladder protocol of Bentov and Kumaresan (Crypto 2014) where we improve the dependence of the script complexity of the protocol (which corresponds to miner verification load and also space on the blockchain) on the number of parties from quadratic to linear (and in particular, is completely independent of the underlying function). Our second improvement is for the see-saw protocol of Kumaresan et al. (CCS 2015) where we reduce the total number of claim-or-refund transactions and also the script complexity from quadratic to linear in the number of parties. We also present a 'dual-mode' protocol that offers different guarantees depending on the number of corrupt parties: (1) when s