Biblio
Darknets, membership-concealing peer-to-peer networks, suffer from high message delivery delays caused by inadequate routing strategies. Their topologies are restricted to a subgraph of the social network of their users, since connections are limited to peers with a mutual real-life trust relationship. Whereas centralized, highly successful social networking services entail a loss of privacy for their users, Darknets with higher performance would represent an optimal private and censorship-resistant communication substrate for social applications. Decentralized routing has so far been analyzed under the assumption that the network resembles a perfect lattice. Freenet, currently the only widely used Darknet, attempts to approximate this structure by embedding the social graph into a metric space. To cope with the resulting distortion, the common greedy routing algorithm is adapted to account for local optima, yet the impact of this adaptation has not been adequately analyzed. We thus suggest a model that integrates inaccuracies in the embedding. Within this model, we show that the Freenet routing algorithm cannot achieve polylog performance. Consequently, we design NextBestOnce, a provably polylog algorithm based only on information about direct neighbors. Furthermore, we show that the routing length of NextBestOnce decreases by more than a constant factor when neighbor-of-neighbor information is included in the decision process.
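As an illustration of the routing idea, the following is a minimal sketch of distance-greedy forwarding that, instead of terminating at a local optimum, continues with the best not-yet-visited neighbor, which is the core idea behind NextBestOnce. The 1D ring metric and the embedding pos are illustrative assumptions, not the paper's exact model, and the centralized loop stands in for a distributed protocol.

    # Distance-greedy routing that escapes local optima by never revisiting
    # nodes (a sketch of the NextBestOnce idea; embedding and metric assumed).
    def ring_distance(a, b):
        d = abs(a - b)
        return min(d, 1.0 - d)  # ring metric on [0, 1), Freenet-style

    def next_best_once(graph, pos, source, target, max_hops=10_000):
        # graph: node -> list of neighbors; pos: node -> position in [0, 1)
        visited = {source}
        current, hops = source, 0
        while current != target and hops < max_hops:
            candidates = [n for n in graph[current] if n not in visited]
            if not candidates:
                return None  # dead end: routing fails in this simple sketch
            # greedy step over unvisited neighbors only
            current = min(candidates,
                          key=lambda n: ring_distance(pos[n], pos[target]))
            visited.add(current)
            hops += 1
        return hops if current == target else None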
Induction is a successful approach for verification of hardware and software systems. A common practice is to model a system using logical formulas and then use a decision procedure to verify that some logical formula is an inductive safety invariant for the system. A key ingredient in this approach is coming up with the inductive invariant, a task known as invariant inference. This is a major difficulty, and it is often left to humans or addressed by sound but incomplete abstract interpretation. This paper is motivated by the problem of inferring inductive invariants in shape analysis and in distributed protocols. It approaches the general problem of inferring first-order inductive invariants by restricting the language L of candidate invariants. Notice that the problem of invariant inference in a restricted language L differs from the safety problem, since a system may be safe and still not have any inductive invariant in L that proves safety. Clearly, if L is finite (and if testing an inductive invariant is decidable), then inferring invariants in L is decidable. This paper presents some interesting cases where inferring inductive invariants in L is decidable even when L is an infinite language of universal formulas. Decidability is obtained by restricting L and defining a suitable well-quasi-order on the state space. We also present undecidability results showing that our restrictions are necessary. We further present a framework for systematically constructing infinite languages while keeping the invariant inference problem decidable. We illustrate our approach by showing the decidability of inferring invariants for programs manipulating linked lists, and for distributed protocols.
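For readers unfamiliar with the key notion, the following states the standard definition of a well-quasi-order and the standard consequence that such decidability arguments rest on; the paper's concrete orders on program states are not reproduced here.

    % Standard definition and lemma (background sketch; not the paper's
    % specific constructions).
    \begin{definition}
      A quasi-order $(Q, \le)$ is a \emph{well-quasi-order} (wqo) if every
      infinite sequence $q_1, q_2, \ldots \in Q$ contains indices $i < j$
      with $q_i \le q_j$.
    \end{definition}
    \begin{lemma}
      If $(Q, \le)$ is a wqo, then every upward-closed set $U \subseteq Q$
      has finitely many minimal elements, and every ascending chain
      $U_1 \subseteq U_2 \subseteq \cdots$ of upward-closed sets stabilizes;
      hence fixpoint iterations over upward-closed sets of states terminate.
    \end{lemma}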
As the number of mobile devices populating the Internet keeps growing at a tremendous pace, context-aware services have gained a lot of traction thanks to the wide set of potential use cases they can be applied to. Environmental sensing applications, emergency services, and location-aware messaging are just a few examples of applications that are expected to increase in popularity in the next few years. The MobilityFirst future Internet architecture, a clean-slate Internet architecture design, provides the necessary abstractions for creating and managing context-aware services. Starting from these abstractions, we design a context services framework based on three fundamental mechanisms: an easy way to specify context using human-understandable techniques, i.e., names; an architecture-supported management mechanism that allows services to be both conveniently deployed and efficiently managed; and a native delivery system that reduces the burden on network components and the overhead cost of deploying such applications. In this paper, we present an emergency alert system for vehicles assisting first responders that exploits users' location awareness to support quick and reliable alert messages for interested vehicles. By deploying a demo of the system on a nationwide testbed, we aim to provide a better understanding of the dynamics involved in our designed framework.
Advanced Persistent Threats (APTs) are a new breed of Internet-based smart threats that can go undetected by existing state-of-the-art Internet traffic monitoring and protection systems. With the evolution of the Internet and cloud computing, a new generation of smart APT attacks has also evolved, and signature-based threat detection systems are proving futile and insufficient. One of the essential strategies in detecting APTs is to continuously monitor and analyze various features of a TCP/IP connection, such as the number of transferred packets, the total count of bytes exchanged, the duration of the TCP/IP connection, and the number of packet flows. Current threat detection approaches make extensive use of machine learning algorithms that utilize statistical and behavioral knowledge of the traffic. However, the performance of these algorithms is far from satisfactory in terms of reducing false negatives and false positives simultaneously; mostly, current algorithms focus on reducing false positives only. This paper presents a fractal-based anomaly classification mechanism with the goal of reducing both false positives and false negatives simultaneously. A comparison of the proposed fractal-based method with a traditional Euclidean-based machine learning algorithm (k-NN) shows that the proposed method significantly outperforms the traditional approach by reducing false positive and false negative rates simultaneously, while improving the overall classification rates.
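For concreteness, the following is a minimal sketch of the Euclidean k-NN baseline that the proposed fractal-based method is compared against; the connection features and toy data are hypothetical placeholders, not the paper's dataset.

    import numpy as np
    from sklearn.neighbors import KNeighborsClassifier
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler

    # One row per TCP/IP connection: [packets, bytes, duration_s, flow_count]
    X_train = np.array([[120, 9.6e4, 30.0, 3], [140, 1.2e5, 35.0, 4],
                        [15000, 2.1e7, 600.0, 40], [13000, 1.8e7, 550.0, 35]])
    y_train = np.array([0, 0, 1, 1])  # 0 = benign, 1 = APT-like anomaly

    clf = make_pipeline(StandardScaler(),
                        KNeighborsClassifier(n_neighbors=3, metric="euclidean"))
    clf.fit(X_train, y_train)
    print(clf.predict([[130, 1.1e5, 28.0, 4]]))  # expected: [0]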
The Domain Name System (DNS) is a fundamental element of Internet technology, translating domain names into the corresponding IP addresses; DNS queries and responses are carried over UDP (User Datagram Protocol). DNS name servers constantly face the threat of DNS amplification attacks, one of the major Distributed Denial of Service (DDoS) attacks against DNS. Such attacks have victimized large businesses and financial organizations, disrupting service to their customers. In this paper, a mechanism is proposed to detect such attacks coming from compromised machines. We comparatively analyse DNS traffic packets using machine learning classification algorithms, namely Decision Tree (TREE), Multi-Layer Perceptron (MLP), Naïve Bayes (NB), and Support Vector Machine (SVM), to classify DNS traffic as normal or abnormal. In this approach, attribute selection algorithms such as Information Gain, Gain Ratio, and Chi Square are used to obtain an optimal feature subset. Experimental results show that the Decision Tree achieves 99.3% accuracy, the highest accuracy and performance of the compared machine learning algorithms.
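A minimal sketch of such a feature-selection plus classification pipeline in scikit-learn is shown below, with information gain approximated by mutual information; the DNS features and toy data are hypothetical placeholders, not the paper's dataset.

    import numpy as np
    from sklearn.feature_selection import SelectKBest, mutual_info_classif
    from sklearn.pipeline import make_pipeline
    from sklearn.tree import DecisionTreeClassifier

    # One row per DNS packet: [query_len, response_len, qtype_is_ANY, ttl]
    X = np.array([[30, 60, 0, 300], [25, 3000, 1, 5], [28, 64, 0, 600],
                  [24, 4096, 1, 10], [31, 72, 0, 450], [26, 3500, 1, 8]])
    y = np.array([0, 1, 0, 1, 0, 1])  # 0 = normal, 1 = amplification traffic

    pipe = make_pipeline(SelectKBest(mutual_info_classif, k=2),
                         DecisionTreeClassifier(random_state=0))
    pipe.fit(X, y)
    print(pipe.predict([[27, 3800, 1, 6]]))  # expected: [1]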
Web applications are used on a large scale worldwide and handle sensitive personal data of users. Since web applications maintain data ranging from something as simple as a telephone number to something as critical as bank account information, security is a prime concern. Among the attacks used to break this security, we focus on SQL injection and cross-site scripting (XSS) attacks. SQL injection is a very common attack that manipulates the data passing from the web application through web servers to the database servers in such a way that it alters or reveals database contents. Cross-site scripting attacks, in contrast, target the view layer of the web application and try to trick users into actions that lead to a security breach. We consider the security of three-tier web applications with static and dynamic behavior. A static and dynamic mapping model is created to detect anomalies in the class of SQL injection and XSS attacks.
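One plausible reading of such a mapping model is sketched below (an illustration, not the paper's exact model): during training, each endpoint is mapped to the set of SQL query skeletons it legitimately issues; at runtime, a query whose skeleton is not in the learned set is flagged, which catches many injected queries.

    import re

    def skeleton(sql: str) -> str:
        sql = re.sub(r"'[^']*'", "?", sql)  # string literals -> ?
        sql = re.sub(r"\b\d+\b", "?", sql)  # numeric literals -> ?
        return re.sub(r"\s+", " ", sql).strip().lower()

    profile: dict[str, set[str]] = {}  # endpoint -> allowed query skeletons

    def train(endpoint: str, sql: str) -> None:
        profile.setdefault(endpoint, set()).add(skeleton(sql))

    def is_anomalous(endpoint: str, sql: str) -> bool:
        return skeleton(sql) not in profile.get(endpoint, set())

    train("/login", "SELECT id FROM users WHERE name='bob' AND pw='x'")
    # A tautology injected via the pw field changes the skeleton -> flagged:
    print(is_anomalous("/login",
          "SELECT id FROM users WHERE name='bob' AND pw='' OR '1'='1'"))  # True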
Steganography is the science of hiding data within data, whether for the good purpose of secret communication or with the bad intention of leaking sensitive confidential data, embedding malicious code, or embedding URLs. Many different carrier file formats can be used to hide such data (network traffic, audio, image, etc.), but the most common steganography carrier is the image, as embedding secret data within images is considered the best and easiest way to hide all types of secret files (another image, text, video, a virus, a URL, etc.). To the human eye, the changes in the appearance of an image carrying hidden data can be imperceptible; images can be more than what we see with our eyes. Many solutions have therefore been proposed to help detect such hidden data, but each has its own strengths and weaknesses, typically being limited to one type of image or one specific hiding technique, and most often unable to extract the hidden data. This paper proposes a novel detection approach that concentrates on detecting any kind of hidden URL in all types of images and on extracting the hidden URL from carrier images that use the least-significant-bit (LSB) hiding technique.
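As background, a minimal sketch of LSB extraction, the hiding technique targeted above: collect the least significant bit of each channel of each pixel, pack the bits into bytes, and scan the payload for a URL. The bit order, channel order, and the file name are assumptions; real stego tools vary on each of these choices.

    import re
    from PIL import Image

    def extract_lsb_bytes(path: str, max_bytes: int = 4096) -> bytes:
        img = Image.open(path).convert("RGB")
        # one LSB per colour channel, row-major pixel order, MSB-first packing
        bits = [ch & 1 for pixel in img.getdata() for ch in pixel]
        bits = bits[:max_bytes * 8]
        return bytes(sum(bit << (7 - i) for i, bit in enumerate(bits[n:n + 8]))
                     for n in range(0, len(bits) - 7, 8))

    def find_urls(payload: bytes) -> list:
        # URLs are printable ASCII; latin-1 decoding never raises
        return re.findall(r"https?://[\x21-\x7e]+", payload.decode("latin-1"))

    print(find_urls(extract_lsb_bytes("suspect.png")))  # hypothetical file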
We consider the following natural generalization of Binary Search: in a given undirected, positively weighted graph, one vertex is a target. The algorithm's task is to identify the target by adaptively querying vertices. In response to querying a node q, the algorithm learns either that q is the target, or is given an edge out of q that lies on a shortest path from q to the target. We study this problem in a general noisy model in which each query independently receives a correct answer with probability p > 1/2 (a known constant), and an (adversarial) incorrect one with probability 1 − p. Our main positive result is that when p = 1 (i.e., all answers are correct), log_2 n queries are always sufficient. For general p, we give an (almost information-theoretically optimal) algorithm that uses, in expectation, no more than (1 − δ) log n / (1 − H(p)) + o(log n) + O(log^2(1/δ)) queries, and identifies the target correctly with probability at least 1 − δ. Here, H(p) = −(p log p + (1 − p) log(1 − p)) denotes the entropy. The first bound is achieved by the algorithm that iteratively queries a 1-median of the nodes not ruled out yet; the second bound by careful repeated invocations of a multiplicative weights algorithm. Even for p = 1, we show several hardness results for the problem of determining whether a target can be found using K queries. Our upper bound of log_2 n implies a quasipolynomial-time algorithm for undirected connected graphs; we show that this is best possible under the Strong Exponential Time Hypothesis (SETH). Furthermore, for directed graphs, or for undirected graphs with non-uniform node querying costs, the problem is PSPACE-complete. For a semi-adaptive version, in which one may query r nodes each in k rounds, we show membership in Σ_{2k−1} in the polynomial hierarchy, and hardness for Σ_{2k−5}.
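The p = 1 strategy is simple enough to sketch directly: repeatedly query a 1-median of the still-possible targets, and use each edge answer to discard every node for which that edge does not begin a shortest path. The sketch below uses networkx; the query oracle is a stand-in for whatever answers the queries.

    import networkx as nx

    def find_target(G, query):
        # query(q) -> ("target", None), or ("edge", u) where edge (q, u) lies
        # on a shortest path from q to the hidden target
        dist = dict(nx.all_pairs_dijkstra_path_length(G, weight="weight"))
        S = set(G.nodes)  # candidate targets not yet ruled out
        while True:
            # query a 1-median: a vertex minimizing total distance to S
            q = min(G.nodes, key=lambda v: sum(dist[v][s] for s in S))
            kind, u = query(q)
            if kind == "target":
                return q
            w = G[q][u].get("weight", 1)
            # keep v only if the edge (q, u) starts a shortest q -> v path
            S = {v for v in S if w + dist[u][v] == dist[q][v]}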
An increasingly important concern of software engineers is handling uncertainty at runtime. Over the last decade, researchers have applied architecture-based self-adaptation approaches to address this concern. However, providing the guarantees required by current software systems has proven challenging with these approaches. To tackle this challenge, we study the application of control theory to realize self-adaptation and develop novel control-based adaptation mechanisms that guarantee desired system properties. Results are validated on systems with strict requirements.
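A minimal sketch of the control-theoretic idea: a PI controller that adapts a hypothetical tuning knob (here, a server pool size) so that a measured metric tracks a setpoint. The gains, metric, and managed system are illustrative assumptions; the paper's concrete controllers and guarantees are not reproduced here.

    def make_pi_controller(setpoint, kp=0.05, ki=0.01):
        integral = 0.0
        def control(measurement):
            nonlocal integral
            error = measurement - setpoint  # positive when latency is too high
            integral += error
            return kp * error + ki * integral  # positive output -> grow pool
        return control

    controller = make_pi_controller(setpoint=100.0)  # target: 100 ms latency
    pool_size = 4.0
    for observed_latency_ms in [180.0, 150.0, 120.0, 105.0]:
        pool_size = max(1.0, pool_size + controller(observed_latency_ms))
        print(f"adapting server pool to {pool_size:.1f}")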
The combination of (1) hard-to-eradicate low-level vulnerabilities, (2) a large trusted computing base written in a memory-unsafe language, and (3) a desperate need to provide strong software security guarantees led to the development of protected-module architectures. Such architectures provide strong isolation of protected modules: the security of code and data depends only on a module's own implementation. In this paper we discuss how such protected modules should be written. From an academic perspective it is clear that the future lies with memory-safe languages. Unfortunately, from a business and management perspective, that is a risky path and will remain so in the near future; the use of well-known but memory-unsafe languages such as C and C++ seems inevitable. We argue that the academic world should take another look at the automatic hardening of software written in such languages to mitigate low-level security vulnerabilities. This is a well-studied topic for full applications, but protected-module architectures introduce a new and much more challenging environment. Porting existing security measures to a protected-module setting without a thorough security analysis may even harm the security of the very modules they try to protect.
In 2013, researchers from the National Security Agency of the USA (NSA) proposed two lightweight block ciphers, SIMON and SPECK [3]. While SIMON is tuned for optimal performance in hardware, SPECK is tuned for optimal performance in software. At CHES 2015, Yang et al. [6] combined the "good" design components of both SIMON and SPECK and proposed a new lightweight block cipher, SIMECK, that is even more compact and efficient. In this paper we show that SIMECK is vulnerable to fault attacks and demonstrate two such attacks. The first is a random bit-flip fault attack that recovers the n-bit last round key of SIMECK using on average about n/2 faults; the second is a more practical random byte fault attack that recovers the n-bit last round key using on average about n/6.5 faults.
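For reference, a sketch of the SIMECK round function as specified by Yang et al., here with the 16-bit words of SIMECK32/64. Fault attacks on the last rounds exploit the AND in f: flipping a bit of the left word changes the output only where the other AND operand is 1, so comparing correct and faulty ciphertexts leaks state bits and hence round-key bits. The attacks themselves are described only in this comment; the code below is just the cipher's round and its inverse.

    W = 16                   # word size of SIMECK32/64
    MASK = (1 << W) - 1

    def rol(x, r):
        return ((x << r) | (x >> (W - r))) & MASK

    def f(x):
        return (x & rol(x, 5)) ^ rol(x, 1)

    def round_enc(x, y, k):  # (x', y') = (y ^ f(x) ^ k, x)
        return (y ^ f(x) ^ k) & MASK, x

    def round_dec(x, y, k):  # invert one round given a round-key guess
        return y, (x ^ f(y) ^ k) & MASK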
Collaborative filtering plays an essential role in a recommender system, which recommends a list of items to a user by learning behavior patterns from the user rating matrix. However, if an attacker has some auxiliary knowledge about a user's purchase history, he or she can infer more information about this user, which poses a serious threat to user privacy. Some methods adopt differential privacy in collaborative filtering by adding noise to the rating matrix. Although they provide theoretically private results, the influence on recommendation accuracy is not discussed. In this paper, we solve the privacy problem in recommender systems in a different way, by applying differential privacy within the recommendation procedure itself. We design two differentially private recommender algorithms with sampling, named Differentially Private Item-Based Recommendation with sampling (DP-IR for short) and Differentially Private User-Based Recommendation with sampling (DP-UR for short). Both algorithms are based on the exponential mechanism with a carefully designed quality function. Theoretical analyses of the privacy of these algorithms are presented. We also investigate the accuracy of the proposed methods and give theoretical results. Experiments are performed on real datasets to verify our methods.
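A minimal sketch of the exponential mechanism that both DP-IR and DP-UR build on: sample a candidate with probability proportional to exp(eps * q(c) / (2 * Δq)), where q is a quality function with sensitivity Δq. The item-similarity scores below are placeholders; the paper's carefully designed quality function is not reproduced.

    import numpy as np

    def exponential_mechanism(candidates, quality, eps, sensitivity, rng):
        # Pr[c] proportional to exp(eps * quality(c) / (2 * sensitivity))
        scores = np.array([quality(c) for c in candidates], dtype=float)
        logits = eps * scores / (2.0 * sensitivity)
        logits -= logits.max()  # numerical stability before exponentiating
        probs = np.exp(logits)
        probs /= probs.sum()
        return candidates[rng.choice(len(candidates), p=probs)]

    rng = np.random.default_rng(0)
    items = ["item_a", "item_b", "item_c"]
    score = {"item_a": 0.9, "item_b": 0.5, "item_c": 0.1}  # hypothetical q
    print(exponential_mechanism(items, score.get, eps=1.0, sensitivity=1.0,
                                rng=rng))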
As our ground transportation infrastructure modernizes, the large amount of data being measured, transmitted, and stored motivates an analysis of the privacy aspects of these emerging cyber-physical technologies. In this paper, we consider privacy in the routing game, where the origins and destinations of drivers are considered private. This is motivated by the fact that such spatiotemporal information can easily be used as the basis for inferences about a person's activities. More specifically, we consider the differential privacy of the mapping from the amount of flow for each origin-destination pair to the traffic flow measurements on each link of a traffic network. We use a stochastic online learning framework for the population dynamics, which is known to converge to the Nash equilibrium of the routing game. We analyze the sensitivity of this process and provide theoretical guarantees on the convergence rates as well as differential privacy values for these models. We confirm these results with simulations on a small example.
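To make the learning dynamics concrete, here is a toy multiplicative-weights (Hedge-style) simulation on a two-route network: at each step, route weights are scaled by exp(-eta * latency) and renormalized. The latency functions and step size are illustrative assumptions, not the paper's model.

    import numpy as np

    # Two parallel routes with linear congestion latencies (illustrative):
    latencies = [lambda x: 1.0 + 2.0 * x,  # route 1
                 lambda x: 2.0 + 1.0 * x]  # route 2
    w = np.array([0.5, 0.5])  # population split over the two routes
    eta = 0.3                 # learning rate
    for _ in range(200):
        ell = np.array([f(x) for f, x in zip(latencies, w)])
        w = w * np.exp(-eta * ell)  # multiplicative-weights update
        w /= w.sum()
    print(w)  # approaches the split (2/3, 1/3), where both latencies equalize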
Presented at the NSA Science of Security Quarterly Meeting, July 2016.
We present PrivInfer, an expressive framework for writing and verifying differentially private Bayesian machine learning algorithms. Programs in PrivInfer are written in a rich functional probabilistic programming language with constructs for performing Bayesian inference. Differential privacy of programs is then established using a relational refinement type system in which refinements on probability types are indexed by a metric on distributions. Our framework leverages recent developments in Bayesian inference, probabilistic programming languages, and relational refinement types. We demonstrate the expressiveness of PrivInfer by verifying privacy for several examples of private Bayesian inference.
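As background for the metric-indexed refinements, recall the standard definition of ε-differential privacy that such type systems are designed to track (the standard definition only; PrivInfer's concrete typing rules are not reproduced here).

    % Standard epsilon-differential privacy: for every pair of adjacent
    % inputs $d \sim d'$ and every set of outputs $S$,
    \[
      \Pr[M(d) \in S] \;\le\; e^{\varepsilon} \, \Pr[M(d') \in S].
    \]
    % Equivalently, the max divergence between the output distributions is
    % bounded: $\Delta_\infty\big(M(d) \,\|\, M(d')\big) \le \varepsilon$.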
There are two broad approaches to differentially private data analysis. The interactive approach aims at developing customized differentially private algorithms for various data mining tasks. The non-interactive approach aims at developing differentially private algorithms that can output a synopsis of the input dataset, which can then be used to support various data mining tasks. In this paper we study the effectiveness of the two approaches for differentially private k-means clustering. We develop techniques to analyze the empirical error behaviors of the existing interactive and non-interactive approaches. Based on this analysis, we propose an improvement of DPLloyd, a differentially private version of Lloyd's algorithm. We also propose a non-interactive approach, EUGkM, which publishes a differentially private synopsis for k-means clustering. Results from extensive and systematic experiments support our analysis and demonstrate the effectiveness of our improvement to DPLloyd and of the proposed EUGkM algorithm.
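A minimal sketch of the DPLloyd idea: a Lloyd (k-means) iteration in which each cluster's point count and coordinate sums are perturbed with Laplace noise before the centroid update. The noise calibration here is deliberately simplified (budget split evenly per iteration and per quantity; data assumed clipped to [-1, 1]^d) and does not reproduce the paper's analysis or improvements.

    import numpy as np

    def dplloyd(X, k, eps, iters, rng):
        n, d = X.shape
        eps_it = eps / (iters * (d + 1))  # naive per-query budget split
        centers = X[rng.choice(n, size=k, replace=False)]
        for _ in range(iters):
            labels = np.argmin(((X[:, None] - centers) ** 2).sum(-1), axis=1)
            for j in range(k):
                pts = X[labels == j]
                # sensitivity: one point changes a count by 1 and each
                # coordinate sum by at most 2 when data lies in [-1, 1]^d
                count = len(pts) + rng.laplace(scale=1.0 / eps_it)
                sums = pts.sum(0) + rng.laplace(scale=2.0 / eps_it, size=d)
                if count >= 1.0:
                    centers[j] = np.clip(sums / count, -1.0, 1.0)
        return centers

    rng = np.random.default_rng(0)
    X = np.clip(rng.normal(size=(500, 2), scale=0.3), -1, 1)
    print(dplloyd(X, k=3, eps=1.0, iters=5, rng=rng))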