Biblio
The Department of Homeland Security Cyber Security Division (CSD) chose Moving Target Defense as one of the fourteen primary Technical Topic Areas pertinent to securing federal networks and the larger Internet. Moving Target Defense over IPv6 (MT6D) employs an obscuration technique offering keyed access to hosts at the network level without altering existing network infrastructure. This is accomplished through cryptographic dynamic addressing, whereby a new network address is bound to an interface every few seconds in a coordinated manner. The goal of this research is to produce a Register Transfer Level (RTL) network security processor implementation to enable the production of an Application Specific Integrated Circuit (ASIC) variant of the MT6D processor for wide deployment. RTL development is challenging in that it must provide system-level functions that are normally provided by the operating system's kernel and supporting libraries. This paper presents the architectural design of a hardware engine for MT6D (HE-MT6D), which is complete in simulation. Unique contributions include an inline stream-based network packet processor with a Complex Instruction Set Computer (CISC) architecture, a Network Time Protocol listener, and theoretically increased performance over previous software implementations.
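The coordinated address rotation described above can be illustrated with a minimal sketch: MT6D-style schemes derive each temporary interface identifier from a hash of the host identifier, a shared secret, and the current time slot, so both endpoints compute the same address independently. The slot length, hash truncation, and prefix below are illustrative assumptions, not the HE-MT6D implementation.

```python
import hashlib
import time

def mt6d_iid(host_iid: bytes, secret: bytes, slot: int) -> str:
    """Derive a 64-bit interface identifier for a given time slot."""
    digest = hashlib.sha256(host_iid + secret + slot.to_bytes(8, "big")).digest()
    return digest[:8].hex()  # truncate to 64 bits (assumption)

def current_address(prefix: str, host_iid: bytes, secret: bytes,
                    slot_seconds: int = 10) -> str:
    """Bind a fresh IPv6-style address to the interface every slot_seconds."""
    slot = int(time.time()) // slot_seconds
    iid = mt6d_iid(host_iid, secret, slot)
    groups = [iid[i:i + 4] for i in range(0, 16, 4)]
    return f"{prefix}:{':'.join(groups)}"

# Sender and receiver that share the secret compute the same address per slot.
print(current_address("2001:db8:0:1", bytes.fromhex("021a2b3c4d5e6f70"), b"shared-secret"))
```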
With Software Defined Networking (SDN), the control-plane logic of forwarding devices (switches and routers) is extracted and moved to an entity called the SDN controller, which acts as a broker between network applications and the physical network infrastructure. Failures of the SDN controller inhibit the network's ability to respond to new application requests and to react to events coming from the physical network. Despite the large impact that a controller has on overall network performance, a comprehensive study of its failure dynamics is still missing from the literature. The goal of this paper is to analyse, model and evaluate the impact that different controller failure modes have on its availability. A model in the formalism of Stochastic Activity Networks (SAN) is proposed and applied to a case study of a hypothetical controller based on commercial controller implementations. In the case study we show how the proposed model can be used to estimate the controller's steady-state availability, quantify the impact of different failure modes on controller outages, and assess the effects of software ageing and the impact of software reliability growth on transient behaviour.
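As a back-of-the-envelope complement to such an availability model, the sketch below computes steady-state availability for a simple two-state (up/down) Markov model from assumed failure and repair rates; the rates are illustrative placeholders, not values from the case study.

```python
def steady_state_availability(mttf_hours: float, mttr_hours: float) -> float:
    """Two-state Markov model: availability = MTTF / (MTTF + MTTR)."""
    return mttf_hours / (mttf_hours + mttr_hours)

# Illustrative numbers only: a controller failing every 1000 h on average
# and taking 0.5 h to recover would be ~99.95% available.
a = steady_state_availability(mttf_hours=1000.0, mttr_hours=0.5)
print(f"steady-state availability: {a:.4%}")
print(f"expected downtime per year: {(1 - a) * 365 * 24:.1f} hours")
```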
Vehicular ad hoc networks (VANETs) are attracting increasing attention from both academia and the automotive industry due to the rapid development of wireless communication technologies. With this development, connected cars are increasingly being equipped with more sensors, processors, storage, and communication devices as they begin to provide both infotainment and safety services through V2X communication. This growth also brings a rise in security attacks and potential security threats. In a vehicular environment, security is one of the most important issues and must be addressed before VANETs can be widely deployed. Conventional VANETs have unique characteristics such as high mobility, dynamic topology, and short connection times. Since an attacker can launch unexpected attacks at any time, it is difficult to predict these attacks in advance. To handle this problem, we propose a collaborative security attack detection mechanism for software-defined vehicular networks that uses a multi-class support vector machine (SVM) to detect various types of attacks dynamically. We compare our security mechanism to an existing distributed approach and present simulation results. The results demonstrate that the proposed security mechanism can effectively identify the types of attacks and achieves good performance in terms of precision, recall, and accuracy.
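A minimal sketch of the multi-class SVM classification step, assuming flow-level features and integer attack-class labels; the feature set and scikit-learn pipeline here are illustrative assumptions, not the paper's detector.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report

# Toy data: rows are per-flow features (e.g. packet rate, byte rate,
# connection duration); labels 0=benign, 1=DoS, 2=spoofing (assumed classes).
rng = np.random.default_rng(0)
X = rng.normal(size=(600, 3)) + np.repeat([[0, 0, 0], [4, 0, 0], [0, 4, 0]], 200, axis=0)
y = np.repeat([0, 1, 2], 200)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

# Multi-class SVM with an RBF kernel (scikit-learn handles the one-vs-one split).
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0, gamma="scale"))
clf.fit(X_train, y_train)
print(classification_report(y_test, clf.predict(X_test), digits=3))
```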
The collaborative recommendation mechanism helps a subject in an open network efficiently find enough referrers who have directly interacted with an object and obtain their trust data. Uncertainty analysis of the collected trust data selects the reliable trust data of trustworthy referrers and then calculates a statistical trust value, at a given confidence level, for any object. The subject can then judge the object's trustworthiness and decide whether to interact based on a given threshold. The feasibility of this method is verified by three experiments designed to validate the model's ability to resist malicious services and exaggeration and slander attacks. The interaction success rate is significantly improved by the new model, and malicious entities are distinguished more effectively than with the comparative model.
The scale of counterfeiting activity is increasing day by day, especially in the electronics market. In this paper, a countermeasure against counterfeiting of intellectual property (IP) on Field-Programmable Gate Arrays (FPGAs) is proposed. FPGA vendors provide bitstream encryption as an IP security solution, for example on battery-backed or non-volatile FPGAs. However, these solutions are secure only as long as the decryption key is kept away from third parties; key storage and key transfer over insecure channels expose them to risk. In this work, physical unclonable functions (PUFs) are used for key generation: generating the key from a circuit inside the device solves the key transfer problem. The proposed system goes through different phases during operation; therefore, the partial reconfiguration feature of FPGAs is essential for its feasibility.
Clean-slate design of computing systems is an emerging topic for the continuing growth of warehouse-scale computers. A well-known custom design is rackscale (RS) computing, which treats a single rack as a computer consisting of a number of processors, storage devices and accelerators customized to a target application. In RS, each user is expected to occupy one or more racks. However, new users frequently appear, and existing users often change their application scales and parameters, which may require different numbers of processors, storage devices and accelerators in a rack. Reconfiguration of the interconnection network among these components is therefore potentially needed to support such demands in RS. In this context, we propose the inter-rackscale (IRS) architecture, which disaggregates hardware resources into different racks according to their resource types. The heart of IRS is the use of free-space optics (FSO) for tightly coupled connections between processors, storage and GPUs distributed across different racks, swapping the endpoints of FSO links to change network topologies. Through a large IRS system simulation, we show that by utilizing FSO links for inter-rack interconnection, the FSO-equipped IRS architecture can provide communication latency between heterogeneous resources comparable to that of the counterpart RS architecture. With 3 FSO terminals per rack, inter-CPU/SSD(GPU) communication improves by at least 87.34% over Fat-tree and by at least 92.18% over 2-D Torus. We also verify the advantages of IRS over RS in job scheduling performance.
Abnormality detection is useful in reducing the amount of data to be processed manually by directing attention to the specific portion of the data. However, the selection of suitable features is important for the success of an abnormality detection system. Designing and selecting appropriate features is time-consuming and requires expensive domain knowledge and human labor. Further, it is very challenging to represent high-level concepts of abnormality in terms of raw input. Most existing abnormality detection systems use handcrafted feature detectors and are based on shallow architectures. In this work, we explore Deep Belief Networks (DBNs) for abnormality detection and compare their performance with a classical neural network in terms of the features learned and the accuracy of detecting abnormalities. Further, we examine the set of features learned by each layer of the deep architecture and provide a simple and fast mechanism to visualize the features at higher layers. The effect of different activation functions on abnormality detection is also compared. We observe that a deep learning based approach can be used for detecting abnormalities and performs better than a classical neural network in separating both distinct and nearly similar data.
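For reference, a DBN can be approximated by greedily stacking Restricted Boltzmann Machines and training a classifier on the top-level features. The scikit-learn pipeline below is a rough sketch of that idea (unsupervised pretraining without fine-tuning), with layer sizes and dataset chosen for illustration rather than taken from the paper.

```python
from sklearn.datasets import load_digits
from sklearn.neural_network import BernoulliRBM
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Scale pixel intensities into [0, 1] as BernoulliRBM expects.
X, y = load_digits(return_X_y=True)
X = X / 16.0
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Two stacked RBMs act as the "deep" feature learner; logistic regression
# classifies in the learned feature space (normal vs. abnormal in general).
model = Pipeline([
    ("rbm1", BernoulliRBM(n_components=128, learning_rate=0.05, n_iter=20, random_state=0)),
    ("rbm2", BernoulliRBM(n_components=64, learning_rate=0.05, n_iter=20, random_state=0)),
    ("clf", LogisticRegression(max_iter=1000)),
])
model.fit(X_train, y_train)
print("accuracy:", accuracy_score(y_test, model.predict(X_test)))
```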
As more and more cyber security incident data, ranging from system logs to vulnerability scan results, are collected, manually analyzing these data to detect important cyber security events becomes impossible. Hence, data mining techniques are becoming an essential tool for real-world cyber security applications. For example, a report from Gartner [gartner12] claims that "Information security is becoming a big data analytics problem, where massive amounts of data will be correlated, analyzed and mined for meaningful patterns". Of course, data mining/analytics is a means to an end, where the ultimate goal is to provide cyber security analysts with prioritized, actionable insights derived from big data. This raises the question: can we directly apply existing techniques to cyber security applications? One of the most important differences between data mining for cyber security and many other data mining applications is the existence of malicious adversaries that continuously adapt their behavior to hide their actions and to make data mining models ineffective. Unfortunately, traditional data mining techniques are insufficient to handle such adversarial problems directly. The adversaries adapt to the data miner's reactions, and data mining algorithms constructed from a training dataset degrade quickly. To address these concerns, over the last couple of years new data mining techniques that are more resilient to such adversarial behavior have been developed in the machine learning and data mining communities. We believe that lessons learned from this research direction will be beneficial for cyber security researchers who are increasingly applying machine learning and data mining techniques in practice. To give an overview of recent developments in adversarial data mining, in this three-hour tutorial we introduce the foundations, the techniques, and the applications of adversarial data mining to cyber security. We first introduce various approaches proposed in the past to defend against active adversaries, such as a minimax approach to minimize the worst-case error through a zero-sum game. We then discuss a game-theoretic framework to model the sequential actions of the adversary and the data miner, while both parties try to maximize their utilities. We also introduce a modified support vector machine method and a relevance vector machine method to defend against active adversaries. Intrusion detection and malware detection are two important application areas for adversarial data mining models that will be discussed in detail during the tutorial. Finally, we discuss some practical guidelines on how to use adversarial data mining ideas in generic cyber security applications and how to leverage existing big data management tools when building data mining algorithms for cyber security.
Organisers of large-scale crowdsourcing initiatives need to consider not only how to produce outcomes with their projects, but also how to build volunteer capacity. The initial project experience of contributors plays an important role in this, particularly when the contribution process requires some degree of expertise. We propose three analytical dimensions to assess first-time contributor engagement based on readily available public data: cohort analysis, task analysis, and observation of contributor performance. We apply these to a large-scale study of remote mapping activities coordinated by the Humanitarian OpenStreetMap Team, a global volunteer effort with thousands of contributors. Our study shows that different coordination practices can have a marked impact on contributor retention, and that complex task designs can be a deterrent for certain contributor groups. We close by providing recommendations about how to build and sustain volunteer capacity in these and comparable crowdsourcing systems.
Millions of users worldwide resort to mobile VPN clients either to circumvent censorship or to access geo-blocked content, and more generally for privacy and security purposes. In practice, however, users have little if any guarantees about the corresponding security and privacy settings, and perhaps no practical knowledge about the entities accessing their mobile traffic. In this paper we provide the first comprehensive analysis of 283 Android apps that use the Android VPN permission, which we extracted from a corpus of more than 1.4 million apps on the Google Play store. We perform a number of passive and active measurements designed to investigate a wide range of security and privacy features and to study the behavior of each VPN-based app. Our analysis includes investigation of possible malware presence, third-party library embedding, and traffic manipulation, as well as gauging user perception of the security and privacy of such apps. Our experiments reveal several instances of VPN apps that expose users to serious privacy and security vulnerabilities, such as use of insecure VPN tunneling protocols, as well as IPv6 and DNS traffic leakage. We also report on a number of apps actively performing TLS interception. Of particular concern are instances of apps that inject JavaScript programs for tracking, advertising, and for redirecting e-commerce traffic to external partners.
Defending key network infrastructure, such as Internet backbone links or the communication channels of critical infrastructure, is paramount, yet challenging. The inherently complex nature and quantity of network data impedes detecting attacks in real-world settings. In this paper, we utilize features of network flows, characterized by their entropy, together with an extended version of the original Replicator Neural Network (RNN) and deep learning techniques to learn models of normality. This combination allows us to apply anomaly-based intrusion detection on arbitrarily large amounts of data and, consequently, large networks. Our approach is unsupervised and requires no labeled data. It also accurately detects network-wide anomalies without presuming that the training data is completely free of attacks. The evaluation of our intrusion detection method on real network data indicates that it can accurately detect resource exhaustion attacks and network profiling techniques of varying intensities. The developed method is efficient because a normality model can be learned by training an RNN in only a few seconds.
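As an illustration of the entropy-based flow features mentioned above, the sketch below computes the Shannon entropy of the source-IP, destination-IP and destination-port distributions within a time window; the window contents and feature choice are assumptions for illustration, not the paper's exact feature set.

```python
import math
from collections import Counter

def shannon_entropy(values) -> float:
    """Shannon entropy (bits) of the empirical distribution of `values`."""
    counts = Counter(values)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def window_features(flows):
    """Entropy feature vector for one time window of flow records."""
    return [
        shannon_entropy(f["src_ip"] for f in flows),
        shannon_entropy(f["dst_ip"] for f in flows),
        shannon_entropy(f["dst_port"] for f in flows),
    ]

# Toy window: a scan-like pattern (one source hitting many ports) yields
# low source-IP entropy but high destination-port entropy.
flows = [{"src_ip": "10.0.0.1", "dst_ip": "10.0.0.9", "dst_port": p} for p in range(100)]
print(window_features(flows))
```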
To prevent unauthorized parties from accessing data stored on their smartphones, users have the option of enabling a "lock screen" that requires a secret code (e.g., PIN, drawing a pattern, or biometric) to gain access to their devices. We present a detailed analysis of the smartphone locking mechanisms currently available to billions of smartphone users worldwide. Through a month-long field study, we logged events from a panel of users with instrumented smartphones (N=134). We are able to show how existing lock screen mechanisms provide users with distinct tradeoffs between usability (unlocking speed vs. unlocking frequency) and security. We find that PIN users take longer to enter their codes, but commit fewer errors than pattern users, who unlock more frequently and are very prone to errors. Overall, PIN and pattern users spent the same amount of time unlocking their devices on average. Additionally, unlock performance seemed unaffected for users enabling the stealth mode for patterns. Based on our results, we identify areas where device locking mechanisms can be improved to result in fewer human errors – increasing usability – while also maintaining security.
We present history-independent alternatives to a B-tree, the primary indexing data structure used in databases. A data structure is history independent (HI) if it is impossible to deduce any information by examining the bit representation of the data structure that is not already available through the API. We show how to build a history-independent cache-oblivious B-tree and a history-independent external-memory skip list. One of the main contributions is a data structure we build along the way: a history-independent packed-memory array (PMA). The PMA supports efficient range queries, one of the most important operations for answering database queries. Our HI PMA matches the asymptotic bounds of prior non-HI packed-memory arrays and sparse tables. Specifically, a PMA maintains a dynamic set of elements in sorted order in a linear-sized array. Inserts and deletes take an amortized O(log^2 N) element moves with high probability. Simple experiments with our implementation of HI PMAs corroborate our theoretical analysis. Comparisons to regular PMAs give preliminary indications that the practical cost of adding history independence is not too large. Our HI cache-oblivious B-tree bounds match those of prior non-HI cache-oblivious B-trees. Searches take O(log_B N) I/Os; inserts and deletes take O((log^2 N)/B + log_B N) amortized I/Os with high probability; and range queries returning k elements take O(log_B N + k/B) I/Os. Our HI external-memory skip list achieves optimal bounds with high probability, analogous to in-memory skip lists: O(log_B N) I/Os for point queries and amortized O(log_B N) I/Os for inserts/deletes. Range queries returning k elements run in O(log_B N + k/B) I/Os. In contrast, the best possible high-probability bound for inserting into the folklore B-skip list, which promotes elements with probability 1/B, is Theta(log N) I/Os. This is no better than the bound one gets from running an in-memory skip list in external memory.
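To make the packed-memory-array idea concrete, here is a toy sketch that keeps sorted keys in an array with gaps and rebalances when the array becomes too dense. It is not history independent and uses a simplistic whole-array rebalance with doubling, rather than the windowed scheme behind the O(log^2 N) amortized bound; it only illustrates the "sorted array with gaps" invariant.

```python
import bisect

class ToyPMA:
    """Sorted keys stored in an array with gaps (None slots)."""

    def __init__(self, capacity: int = 8):
        self.slots = [None] * capacity

    def _keys(self):
        return [k for k in self.slots if k is not None]

    def _spread(self, keys, capacity):
        """Rebuild the slot array, spreading keys evenly to leave gaps."""
        self.slots = [None] * capacity
        if keys:
            step = capacity / len(keys)
            for i, k in enumerate(keys):
                self.slots[int(i * step)] = k

    def insert(self, key):
        keys = self._keys()
        bisect.insort(keys, key)
        # Grow (and thereby rebalance) when density exceeds 50%; a real PMA
        # rebalances geometrically sized windows instead of the whole array.
        capacity = len(self.slots)
        if len(keys) > capacity // 2:
            capacity *= 2
        self._spread(keys, capacity)

    def range_query(self, lo, hi):
        return [k for k in self.slots if k is not None and lo <= k <= hi]

pma = ToyPMA()
for k in [42, 7, 19, 3, 25, 11]:
    pma.insert(k)
print(pma.range_query(5, 30))  # -> [7, 11, 19, 25]
```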
The shift from the host-centric to the information-centric paradigm results in many benefits including native security, enhanced mobility, and scalability. The corresponding information-centric networking (ICN) paradigm also presents several important challenges, such as closest-replica routing, client privacy, and client preference collection. The majority of these challenges have received the research community's attention. However, no mechanisms have been proposed for the challenge of effective client preference collection. In the era of big data analytics and recommender systems, customer preferences are essential for providers such as Amazon and Netflix. However, with content served from in-network caches, the ICN paradigm indirectly undermines the gathering of these essential individualized preferences. In this paper, we discuss the requirements for client preference collection and present potential mechanisms that may be used to achieve it successfully.
With the boom of ride-sharing platforms, there has been a growing debate on ride-sharing regulations. In particular, allegations of rape against ride-sharing drivers put sexual assault at the center of this debate. However, there is no systematic and society-wide evidence regarding ride-sharing and sexual assault. Building on a theory of crime victimization, this study examines the effect of ride-sharing on sexual assault incidents using comprehensive data on Uber transactions and crime incidents in New York City over the period from January to March 2015. Our findings demonstrate that Uber availability is negatively associated with the likelihood of rape, after controlling for endogeneity. Moreover, the deterrent effect of Uber on sexual assault is entirely driven by the taxi-sparse areas, namely outside Manhattan. This study sheds light on the potential of ride-sharing platforms and the sharing economy to improve social welfare beyond economic gains.
Power system security is one of the key issues in the operation of smart grid systems. Evaluating power system security is a big challenge when all contingencies are considered, due to the huge computational effort involved. Phasor measurement units (PMUs) play a vital role in real-time power system monitoring and control. This paper presents a static security assessment scheme for large-scale interconnected power systems with PMUs using an Artificial Neural Network (ANN). Voltage magnitude and phase angle are used as input variables of the ANN. The optimal locations of PMUs under the base case and critical contingency cases are determined using a genetic algorithm. The performance of the proposed optimization model was tested on the standard IEEE 30-bus system incorporating zero-injection buses, and successful results were obtained.
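A minimal sketch of the classification step, assuming PMU voltage magnitudes and phase angles as inputs and a binary secure/insecure label per operating state; the network size, synthetic data, and scikit-learn MLP are illustrative assumptions rather than the paper's setup.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import train_test_split

# Toy operating states: 30 buses -> 60 features (|V| and angle per bus).
rng = np.random.default_rng(1)
n_states, n_buses = 500, 30
V = rng.normal(1.0, 0.05, size=(n_states, n_buses))       # per-unit magnitudes
theta = rng.normal(0.0, 0.1, size=(n_states, n_buses))    # angles in radians
X = np.hstack([V, theta])
# Assumed labelling rule for the toy data: insecure if any |V| drifts far from 1 p.u.
y = (np.abs(V - 1.0).max(axis=1) > 0.1).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=1)
ann = make_pipeline(StandardScaler(),
                    MLPClassifier(hidden_layer_sizes=(32, 16), max_iter=2000, random_state=1))
ann.fit(X_tr, y_tr)
print("secure/insecure classification accuracy:", ann.score(X_te, y_te))
```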
Hypervisors are the main components for managing virtual machines in cloud computing systems. Thus, the security of hypervisors is crucial, as the whole system can be compromised when just one vulnerability is exploited. In this paper, we assess the vulnerabilities of widely used hypervisors, including VMware ESXi, Citrix XenServer and KVM, using the NIST 800-115 security testing framework. We perform real experiments to assess the vulnerabilities of these hypervisors using security testing tools. The results are evaluated using weakness information from CWE and vulnerability information from CVE. We also compute severity scores using CVSS information. All vulnerabilities found in the three hypervisors are compared in terms of weaknesses, severity scores and impact. The experimental results show that ESXi and XenServer have common weaknesses and vulnerabilities, whereas KVM has fewer vulnerabilities. In addition, we discover a new vulnerability, an HTTP response splitting issue in the ESXi Web interface.
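For reference, the sketch below reproduces the CVSS v2 base score computation commonly used for this kind of severity scoring; the metric values shown are an example vector, not scores from the paper.

```python
def cvss_v2_base(av, ac, au, c, i, a):
    """CVSS v2 base score from the six base-metric weights."""
    impact = 10.41 * (1 - (1 - c) * (1 - i) * (1 - a))
    exploitability = 20 * av * ac * au
    f_impact = 0.0 if impact == 0 else 1.176
    score = ((0.6 * impact) + (0.4 * exploitability) - 1.5) * f_impact
    return round(score, 1)

# Standard CVSS v2 weights (subset): AV network=1.0, AC low=0.71,
# Au none=0.704, C/I/A partial=0.275, none=0.0.
# Example vector AV:N/AC:L/Au:N/C:P/I:P/A:N -> 6.4 (medium severity).
print(cvss_v2_base(av=1.0, ac=0.71, au=0.704, c=0.275, i=0.275, a=0.0))
```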
Unlike most social media, where automatic archiving of data is the default, Snapchat defaults to ephemerality: deleting content shortly after it is viewed by a receiver. Interviews with 25 Snapchat users show that ephemerality plays a key role in shaping their practices. Along with friend-adding features that facilitate a network of mostly close relations, default deletion affords everyday, mundane talk and reduces self-consciousness while encouraging playful interaction. Further, although receivers can save content through screenshots, senders are notified; this selective saving with notification supports complex information norms that preserve the feel of ephemeral communication while supporting the capture of meaningful content. This dance of giving and taking, sharing and showing, and agency for both senders and receivers provides the basis for a rich design space of mechanisms, levels, and domains for ephemerality.
HDFS has been widely used for storing data at massive scale, which is vulnerable to site disasters. File system backup is an important strategy for data retention. In this paper, we present an efficient, easy-to-use backup and disaster recovery system for HDFS. The system includes a client based on HDFS with an additional remote backup feature, and a remote server with an HDFS cluster to keep the backup data. It supports full backups and regular incremental backups to the server with very low cost and high throughput. In our experiments, the average speed of backup and recovery reaches 95 MB/s, approaching the theoretical maximum speed of gigabit Ethernet.
We propose a new voting scheme, BeleniosRF, that offers both receipt-freeness and end-to-end verifiability. It is receipt-free in a strong sense, meaning that even dishonest voters cannot prove how they voted. We provide a game-based definition of receipt-freeness for voting protocols with non-interactive ballot casting, which we name strong receipt-freeness (sRF). To our knowledge, sRF is the first game-based definition of receipt-freeness in the literature, and it has the merit of being particularly concise and simple. Built upon the Helios protocol, BeleniosRF inherits its simplicity and does not require any anti-coercion strategy from the voters. We implement BeleniosRF and show its feasibility on a number of platforms, including desktop computers and smartphones.
In this study, we report on techniques and analyses that enable us to capture Internet-wide activity at individual IP address-level granularity by relying on server logs of a large commercial content delivery network (CDN) that serves close to 3 trillion HTTP requests on a daily basis. Across the whole of 2015, these logs recorded client activity involving 1.2 billion unique IPv4 addresses, the highest ever measured, in agreement with recent estimates. Monthly client IPv4 address counts showed constant growth for years prior, but since 2014, the IPv4 count has stagnated while IPv6 counts have grown. Thus, it seems we have entered an era marked by increased complexity, one in which the sole enumeration of active IPv4 addresses is of little use to characterize recent growth of the Internet as a whole. With this observation in mind, we consider new points of view in the study of global IPv4 address activity. First, our analysis shows significant churn in active IPv4 addresses: the set of active IPv4 addresses varies by as much as 25% over the course of a year. Second, by looking across the active addresses in a prefix, we are able to identify and attribute activity patterns to network restructurings, user behaviors, and, in particular, various address assignment practices. Third, by combining spatio-temporal measures of address utilization with measures of traffic volume, and sampling-based estimates of relative host counts, we present novel perspectives on worldwide IPv4 address activity, including empirical observation of under-utilization in some areas, and complete utilization, or exhaustion, in others.
Conventional security methods such as passwords and ID cards are rapidly being replaced by biometrics for person identification. Biometrics uses the physiological or behavioral characteristics of a person. The use of biometrics raises critical privacy and security concerns that, due to the noisy nature of biometrics, cannot be addressed using standard cryptographic methods. The loss of an enrollment biometric to an attacker is a security hazard because it may allow the attacker to gain unauthorized access to the system. A biometric template can be stolen, and an intruder can access the biometric system using a fake input. Hence, it becomes essential to design a biometric system with a secure template, so that if the biometric template in an application is compromised, the biometric signal itself is not lost forever and a new template can be issued. One way is to combine biometrics and cryptography, or to use transformed data instead of the original biometric template. However, traditional cryptographic methods are not directly applicable to biometrics because of intra-class variation. Biometric cryptosystems can apply fuzzy vault, fuzzy commitment, helper data and secure sketch schemes, whereas cancelable biometrics uses distorting transforms, Bio-Hashing, and Bio-Encoding techniques. In this paper, a biometric cryptosystem with fuzzy vault and fuzzy commitment techniques is presented for a fingerprint recognition system.
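To illustrate the fuzzy commitment idea mentioned above: a key is encoded with an error-correcting code, XOR-ed with the enrollment biometric to form helper data, and later recovered from a noisy but close query biometric. The sketch below uses a trivial 5x repetition code and random bit vectors purely for illustration; real systems use stronger codes (e.g., BCH) and real fingerprint features.

```python
import hashlib
import random

REP = 5  # repetition factor of the toy error-correcting code

def encode(key_bits):            # repeat each key bit REP times
    return [b for b in key_bits for _ in range(REP)]

def decode(code_bits):           # majority vote per group of REP bits
    return [int(sum(code_bits[i:i + REP]) > REP // 2)
            for i in range(0, len(code_bits), REP)]

def xor(a, b):
    return [x ^ y for x, y in zip(a, b)]

random.seed(0)
key = [random.randint(0, 1) for _ in range(16)]
enroll = [random.randint(0, 1) for _ in range(16 * REP)]   # enrollment biometric bits

# Commitment: hash of the key plus helper data (codeword XOR biometric).
helper = xor(encode(key), enroll)
key_hash = hashlib.sha256(bytes(key)).hexdigest()

# Verification with a noisy query (a few bits of the enrollment sample flipped).
query = enroll[:]
for pos in (0, 7, 14, 21):       # at most one flipped bit per code group
    query[pos] ^= 1
recovered = decode(xor(helper, query))
print("key recovered:", hashlib.sha256(bytes(recovered)).hexdigest() == key_hash)
```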
In this paper we provide a framework to analyze the effect of uniform sampling on graph optimization problems. Interestingly, we apply this framework to a general class of graph optimization problems that we call heavy subgraph problems, and show that uniform sampling preserves a (1-ε)-approximate solution to these problems. This class contains many interesting problems such as densest subgraph, directed densest subgraph, densest bipartite subgraph, d-max cut, and d-sum-max clustering. As an immediate impact of this result, one can use uniform sampling to solve these problems in streaming, turnstile or Map-Reduce settings. Indeed, by characterizing heavy subgraph problems, our results address Open Problem 13 from the 2006 IITK Workshop on Algorithms for Data Streams regarding the effects of subsampling in the context of graph streams. Recently, Bhattacharya et al. (STOC 2015) provided the first one-pass algorithm for the densest subgraph problem in the streaming model with additions and deletions of edges, i.e., for dynamic graph streams. They present a (0.5-ε)-approximation algorithm using Õ(n) space, where factors of ε and log(n) are suppressed in the Õ notation. In this paper we improve on the (0.5-ε)-approximation algorithm of Bhattacharya et al. by providing a (1-ε)-approximation algorithm using Õ(n) space.
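The sketch below illustrates the general recipe implied above: uniformly sample edges, run a standard densest-subgraph heuristic (Charikar's greedy peeling, a 0.5-approximation) on the sample, and rescale the resulting density by the sampling probability. The sampling rate and the use of greedy peeling are illustrative choices, not the paper's algorithm.

```python
import random
from collections import defaultdict

def uniform_sample(edges, p, seed=0):
    """Keep each edge independently with probability p."""
    rng = random.Random(seed)
    return [e for e in edges if rng.random() < p]

def greedy_densest_subgraph(edges):
    """Charikar's peeling heuristic: repeatedly remove a min-degree vertex."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    best_density, m, nodes = 0.0, len(edges), set(adj)
    while nodes:
        best_density = max(best_density, m / len(nodes))
        u = min(nodes, key=lambda x: len(adj[x]))
        for v in adj[u]:
            adj[v].discard(u)
        m -= len(adj[u])
        nodes.remove(u)
    return best_density

# Toy graph: a dense 6-clique plus a sparse path attached to it.
clique = [(i, j) for i in range(6) for j in range(i + 1, 6)]
path = [(5 + i, 6 + i) for i in range(20)]
edges = clique + path

p = 0.5
est = greedy_densest_subgraph(uniform_sample(edges, p)) / p
print("estimated density (rescaled):", round(est, 2),
      "  exact greedy density:", round(greedy_densest_subgraph(edges), 2))
```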
With the use of external cloud services such as Google Docs or Evernote in an enterprise setting, the loss of control over sensitive data becomes a major concern for organisations. It is typical for regular users to violate data disclosure policies accidentally, e.g. when sharing text between documents in browser tabs. Our goal is to help such users comply with data disclosure policies: we want to alert them about potentially unauthorised data disclosure from trusted to untrusted cloud services. This is particularly challenging when users can modify data in arbitrary ways, they employ multiple cloud services, and cloud services cannot be changed. To track the propagation of text data robustly across cloud services, we introduce imprecise data flow tracking, which identifies data flows implicitly by detecting and quantifying the similarity between text fragments. To reason about violations of data disclosure policies, we describe a new text disclosure model that, based on similarity, associates text fragments in web browsers with security tags and identifies unauthorised data flows to untrusted services. We demonstrate the applicability of imprecise data tracking through BrowserFlow, a browser-based middleware that alerts users when they expose potentially sensitive text to an untrusted cloud service. Our experiments show that BrowserFlow can robustly track data flows and manage security tags for documents with no noticeable performance impact.
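A minimal sketch of the "imprecise", similarity-based tracking idea: compare a text fragment leaving the browser against tagged sensitive documents using Jaccard similarity over word shingles, and raise an alert above a threshold. The shingle size and threshold are illustrative assumptions; BrowserFlow's actual model is more involved.

```python
def shingles(text: str, k: int = 3) -> set:
    """Set of k-word shingles, used as an order-insensitive fingerprint."""
    words = text.lower().split()
    return {" ".join(words[i:i + k]) for i in range(max(1, len(words) - k + 1))}

def jaccard(a: set, b: set) -> float:
    return len(a & b) / len(a | b) if a | b else 0.0

def check_disclosure(fragment: str, tagged_docs: dict, threshold: float = 0.3):
    """Return (doc_id, similarity) pairs whose similarity exceeds the threshold."""
    frag = shingles(fragment)
    return [(doc_id, round(jaccard(frag, shingles(doc)), 2))
            for doc_id, doc in tagged_docs.items()
            if jaccard(frag, shingles(doc)) > threshold]

confidential = {"q3-forecast": "revenue forecast for the third quarter shows a decline in emea sales"}
pasted = "the third quarter shows a decline in emea sales according to internal numbers"
print(check_disclosure(pasted, confidential))  # flags the overlap with q3-forecast
```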
In the Internet of Things (IoT), Internet-connected things provide an influx of data and resources that offer unlimited possibilities for applications and services. Smart City IoT systems refer to things that are distributed over wide physical areas covering a whole city. While this new breed of data and resources looks promising, building applications in such large-scale IoT systems is a difficult task due to the distributed and dynamic nature of the entities involved, such as sensing and actuating devices, people, and computing resources. In this paper, we explore the process of developing Smart City IoT applications from a coordination-based perspective. We show that a distributed coordination model that oversees such a large group of distributed components is necessary for building Smart City IoT applications. In particular, we propose Adaptive Distributed Dataflow, a novel Dataflow-based programming model that focuses on coordinating city-scale distributed systems that are highly heterogeneous and dynamic.
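A toy sketch of the dataflow style of coordination described above: nodes are connected by channels, and the runtime only wires sources to processing stages and sinks. The node set and wiring below are hypothetical; they illustrate dataflow composition, not the Adaptive Distributed Dataflow runtime.

```python
import queue
import threading

def node(fn, inbox, outbox):
    """Run a dataflow node: apply fn to each item until a None sentinel arrives."""
    def loop():
        while (item := inbox.get()) is not None:
            result = fn(item)
            if outbox is not None and result is not None:
                outbox.put(result)
        if outbox is not None:
            outbox.put(None)  # propagate shutdown downstream
    t = threading.Thread(target=loop)
    t.start()
    return t

raw, filtered = queue.Queue(), queue.Queue()

# Wiring: sensor readings -> threshold filter -> alert sink.
threads = [
    node(lambda r: r if r["temp"] > 30 else None, raw, filtered),
    node(lambda r: print("alert:", r), filtered, None),
]

for reading in [{"id": "s1", "temp": 21}, {"id": "s2", "temp": 35}, {"id": "s3", "temp": 33}]:
    raw.put(reading)
raw.put(None)  # shutdown sentinel
for t in threads:
    t.join()
```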