Biblio
As embedded devices (under the guise of "smart-whatever") rapidly proliferate into many domains, they become attractive targets for malware. Protecting them from software and physical attacks becomes both important and challenging. Remote attestation is a basic tool for mitigating such attacks. It allows a trusted party (verifier) to remotely assess software integrity of a remote, untrusted, and possibly compromised, embedded device (prover). Prior remote attestation methods focus on software (malware) attacks in a one-verifier/one-prover setting. Physical attacks on provers are generally ruled out as being either unrealistic or impossible to mitigate. In this paper, we argue that physical attacks must be considered, particularly, in the context of many provers, e.g., a network, of devices. As- suming that physical attacks require capture and subsequent temporary disablement of the victim device(s), we propose DARPA, a light-weight protocol that takes advantage of absence detection to identify suspected devices. DARPA is resilient against a very strong adversary and imposes minimal additional hardware requirements. We justify and identify DARPA's design goals and evaluate its security and costs.
The traditional text classification methods usually follow this process: first, a sentence can be considered as a bag of words (BOW), then transformed into sentence feature vector which can be classified by some methods, such as maximum entropy (ME), Naive Bayes (NB), support vector machines (SVM), and so on. However, when these methods are applied to text classification, we usually can not obtain an ideal result. The most important reason is that the semantic relations between words is very important for text categorization, however, the traditional method can not capture it. Sentiment classification, as a special case of text classification, is binary classification (positive or negative). Inspired by the sentiment analysis, we use a novel deep learning-based recurrent neural networks (RNNs)model for automatic security audit of short messages from prisons, which can classify short messages(secure and non-insecure). In this paper, the feature of short messages is extracted by word2vec which captures word order information, and each sentence is mapped to a feature vector. In particular, words with similar meaning are mapped to a similar position in the vector space, and then classified by RNNs. RNNs are now widely used and the network structure of RNNs determines that it can easily process the sequence data. We preprocess short messages, extract typical features from existing security and non-security short messages via word2vec, and classify short messages through RNNs which accept a fixed-sized vector as input and produce a fixed-sized vector as output. The experimental results show that the RNNs model achieves an average 92.7% accuracy which is higher than SVM.
Privilege Escalation is a common and serious type of security attack. Although experience shows that many applications are vulnerable to such attacks, attackers rarely succeed upon first trial. Their initial probing attempts often fail before a successful breach of access control is achieved. This paper presents an approach to automatically instrument application source code to report events of failed access attempts that may indicate privilege escalation attacks to a run time application protection mechanism. The focus of this paper is primarily on the problem of instrumenting web application source code to detect access control attack events. We evaluated false positives and negatives of our approach using two open source web applications.
Traditional sensitive data disclosure analysis faces two challenges: to identify sensitive data that is not generated by specific API calls, and to report the potential disclosures when the disclosed data is recognized as sensitive only after the sink operations. We address these issues by developing BidText, a novel static technique to detect sensitive data disclosures. BidText formulates the problem as a type system, in which variables are typed with the text labels that they encounter (e.g., during key-value pair operations). The type system features a novel bi-directional propagation technique that propagates the variable label sets through forward and backward data-flow. A data disclosure is reported if a parameter at a sink point is typed with a sensitive text label. BidText is evaluated on 10,000 Android apps. It reports 4,406 apps that have sensitive data disclosures, with 4,263 apps having log based disclosures and 1,688 having disclosures due to other sinks such as HTTP requests. Existing techniques can only report 64.0% of what BidText reports. And manual inspection shows that the false positive rate for BidText is 10%.
Presented at the NSA Science of Security Quarterly Meeting, July 2016.
Smart environments and security systems require automatic detection of human behaviors including approaching to or departing from an object. Existing human motion detection systems usually require human beings to carry special devices, which limits their applications. In this paper, we present a system called APID to detect arm reaching by analyzing backscatter communication signals from a passive RFID tag on the object. APID does not require human beings to carry any device. The idea is based on the influence of human movements to the vibration of backscattered tag signals. APID is compatible with commodity off-the-shelf devices and the EPCglobal Class-1 Generation-2 protocol. In APID an commercial RFID reader continuously queries tags through emitting RF signals and tags simply respond with their IDs. A USRP monitor passively analyzes the communication signals and reports the approach and departure behaviors. We have implemented the APID system for both single-object and multi-object scenarios in both horizontal and vertical deployment modes. The experimental results show that APID can achieve high detection accuracy.
The concept of differential privacy stems from the study of private query of datasets. In this work, we apply this concept to discrete-time, linear distributed control systems in which agents need to maintain privacy of certain preferences, while sharing information for better system-level performance. The system has N agents operating in a shared environment that couples their dynamics. We show that for stable systems the performance grows as O(T3/Nε2), where T is the time horizon and ε is the differential privacy parameter. Next, we study lower-bounds in terms of the Shannon entropy of the minimal mean square estimate of the system’s private initial state from noisy communications between an agent and the server. We show that for any of noise-adding differentially private mechanism, then the Shannon entropy is at least nN(1−ln(ε/2)), where n is the dimension of the system, and t he lower bound is achieved by a Laplace-noise-adding mechanism. Finally, we study the problem of keeping the objective functions of individual agents differentially private in the context of cloud-based distributed optimization. The result shows a trade-off between the privacy of objective functions and the performance of the distributed optimization algorithm with noise.
Presented at the Joint Trust and Security/Science of Security Seminar, April 26, 2016.
Collaborative filtering plays an essential role in a recommender system, which recommends a list of items to a user by learning behavior patterns from user rating matrix. However, if an attacker has some auxiliary knowledge about a user purchase history, he/she can infer more information about this user. This brings great threats to user privacy. Some methods adopt differential privacy algorithms in collaborative filtering by adding noises to a rating matrix. Although they provide theoretically private results, the influence on recommendation accuracy are not discussed. In this paper, we solve the privacy problem in recommender system in a different way by applying the differential privacy method into the procedure of recommendation. We design two differentially private recommender algorithms with sampling, named Differentially Private Item Based Recommendation with sampling (DP-IR for short) and Differentially Private User Based Recommendation with sampling(DP-UR for short). Both algorithms are based on the exponential mechanism with a carefully designed quality function. Theoretical analyses on privacy of these algorithms are presented. We also investigate the accuracy of the proposed method and give theoretical results. Experiments are performed on real datasets to verify our methods.
Presented at the NSA Science of Security Quarterly Meeting, July 2016.
We propose a distributed and adaptive trust evaluation algorithm (DATEA) to calculate the trust between nodes. First, calculate the communication trust by using the number of data packets between nodes, and predict the trust based on the trend of this value, calculate the comprehensive trust by combining the history trust with the predict value; calculate the energy trust based on the residual energy of nodes; calculate the direct trust by using the communication trust and energy trust. Second, calculate the recommendation trust based on the recommendation reliability and the recommendation familiarity; put forward the adaptively weighting method, and calculate the integrate direct trust by combining the direct trust with recommendation trust. Third, according to the integrate direct trust, considering the factor of trust propagation distance, the indirect trust between nodes is calculated. Simulation experiments show that the proposed algorithm can effectively avoid the attacks of malicious nodes, besides, the calculated direct trust and indirect trust about normal nodes are more conformable to the actual situation.
Querying over encrypted data is gaining increasing popularity in cloud-based data hosting services. Security and efficiency are recognized as two important and yet conflicting requirements for querying over encrypted data. In this article, we propose an efficient private keyword search (EPKS) scheme that supports binary search and extend it to dynamic settings (called DEPKS) for inverted index–based encrypted data. First, we describe our approaches of constructing a searchable symmetric encryption (SSE) scheme that supports binary search. Second, we present a novel framework for EPKS and provide its formal security definitions in terms of plaintext privacy and predicate privacy by modifying Shen et al.’s security notions [Shen et al. 2009]. Third, built on the proposed framework, we design an EPKS scheme whose complexity is logarithmic in the number of keywords. The scheme is based on the groups of prime order and enjoys strong notions of security, namely statistical plaintext privacy and statistical predicate privacy. Fourth, we extend the EPKS scheme to support dynamic keyword and document updates. The extended scheme not only maintains the properties of logarithmic-time search efficiency and plaintext privacy and predicate privacy but also has fewer rounds of communications for updates compared to existing dynamic search encryption schemes. We experimentally evaluate the proposed EPKS and DEPKS schemes and show that they are significantly more efficient in terms of both keyword search complexity and communication complexity than existing randomized SSE schemes.
While event logs generated by business processes play an increasingly significant role in business analysis, the quality of data remains a serious problem. Automatic recovery of dirty event logs is desirable and thus receives more attention. However, existing methods only focus on missing event recovery, or fall short of efficiency. To this end, we present Effa, a ProM plugin, to automatically recover event logs in the light of process specifications. Based on advanced heuristics including process decomposition and trace replaying to search the minimum recovery, Effa achieves a balance between repairing accuracy and efficiency.
This paper introduces a design and implementation of a security scheme for the Internet of Things (IoT) based on ECQV Implicit Certificates and Datagram Transport Layer Security (DTLS) protocol. In this proposed security scheme, Elliptic curve cryptography based ECQV implicit certificate plays a key role allowing mutual authentication and key establishment between two resource-constrained IoT devices. We present how IoT devices get ECQV implicit certificates and use them for authenticated key exchange in DTLS. An evaluation of execution time of the implementation is also conducted to assess the efficiency of the solution.
The emergence and wide availability of remote storage service providers prompted work in the security community that allows clients to verify integrity and availability of the data that they outsourced to a not fully trusted remote storage server at a relatively low cost. Most recent solutions to this problem allow clients to read and update (i.e., insert, modify, or delete) stored data blocks while trying to lower the overhead associated with verifying the integrity of the stored data. In this work, we develop a novel scheme, performance of which favorably compares with the existing solutions. Our solution additionally enjoys a number of new features, such as a natural support for operations on ranges of blocks, revision control, and support for multiple user access to shared content. The performance guarantees that we achieve stem from a novel data structure called a balanced update tree and removing the need for interaction during update operations in addition to communicating the updates themselves.
The Wikidata platform is a crowdsourced, structured knowledgebase aiming to provide integrated, free and language-agnostic facts which are–-amongst others–-used by Wikipedias. Users who actively enter, review and revise data on Wikidata are assisted by a property suggesting system which provides users with properties that might also be applicable to a given item. We argue that evaluating and subsequently improving this recommendation mechanism and hence, assisting users, can directly contribute to an even more integrated, consistent and extensive knowledge base serving a huge variety of applications. However, the quality and usefulness of such recommendations has not been evaluated yet. In this work, we provide the first evaluation of different approaches aiming to provide users with property recommendations in the process of curating information on Wikidata. We compare the approach currently facilitated on Wikidata with two state-of-the-art recommendation approaches stemming from the field of RDF recommender systems and collaborative information systems. Further, we also evaluate hybrid recommender systems combining these approaches. Our evaluations show that the current recommendation algorithm works well in regards to recall and precision, reaching a recall@7 of 79.71% and a precision@7 of 27.97%. We also find that generally, incorporating contextual as well as classifying information into the computation of property recommendations can further improve its performance significantly.
We study the problem of k-anonymization of mail messages in the realistic scenario of auditing mail traffic in a major commercial Web mail service. Mail auditing is necessary in various Web mail debugging and quality assurance activities, such as anti-spam or the qualitative evaluation of novel mail features. It is conducted by trained professionals, often referred to as "auditors", who are shown messages that could expose personally identifiable information. We address here the challenge of k-anonymizing such messages, focusing on machine generated mail messages that represent more than 90% of today's mail traffic. We introduce a novel message signature Mail-Hash, specifically tailored to identifying structurally-similar messages, which allows us to put such messages in a same equivalence class. We then define a process that generates, for each class, masked mail samples that can be shown to auditors, while guaranteeing the k-anonymity of users. The productivity of auditors is measured by the amount of non-hidden mail content they can see every day, while considering normal working conditions, which set a limit to the number of mail samples they can review. In addition, we consider k-anonymity over time since, by definition of k-anonymity, every new release places additional constraints on the assignment of samples. We describe in details the results we obtained over actual Yahoo mail traffic, and thus demonstrate that our methods are feasible at Web mail scale. Given the constantly growing concern of users over their email being scanned by others, we argue that it is critical to devise such algorithms that guarantee k-anonymity, and implement associated processes in order to restore the trust of mail users.
Although computing students may enjoy when their instructors teach using analogies, it is unknown to what extent these analogies are useful for their learning. This study examines the value of analogies when used to introduce three introductory computing topics. The value of these analogies may be evident during the teaching process itself (short term), in subsequent exams (long term), or in students' ability to apply their understanding to related non-technical areas (transfer). Comparing results between an experimental group (analogy) and control group (no analogy), we find potential value for analogies in short term learning. However, no solid evidence was found to support analogies as valuable for students in the long term or for knowledge transfer. Specific demographic groups were examined and promising preliminary findings are presented.
This paper describes a small experimental study into the use of avatars to remediate the lecturer's absence in voice-over-slide material. Four different avatar behaviours are tested. Avatar A performs all the upper-body gestures of the lecturer, which were recorded using a 3D depth sensor. Avatar B is animated using few random gestures in order to create a natural presence that is unrelated to the speech. Avatar C only performs the lecturer's pointing gestures, as these are known to indicate important parts of a lecture. Finally, Avatar D performs "lecturer-like" gestures, but these are desynchronised with the speech. Preliminary results indicate students' preference for Avatars A and C. Although the effect of avatar behaviour on learning did not prove statistically significant, students' comments indicate that an avatar that behaves quietly and only performs some of the lecturer's gestures (pointing) is effective. The paper also presents a simple empirical method for automatically detecting pointing gestures in Kinect recorded lecture data.