Biblio
Obtaining frequency information of data streams, in limited space, is a well-recognized problem in the literature. A number of recent practical applications (such as those in computational advertising) require temporally-aware solutions: obtaining historical count statistics for both time-points and time-ranges. In these scenarios, accuracy of estimates is typically more important for recent instances than for older ones; we call this desirable property time adaptiveness. With this observation, [20] introduced the Hokusai technique, based on count-min sketches, for estimating the frequency of any given item at any given time. The proposed approach is problematic in practice, as its memory requirements grow linearly with time and it produces discontinuities in estimation accuracy. In this work, we describe a new method, Time-adaptive Sketches (Ada-sketch), that overcomes these limitations while providing a strict generalization of several popular sketching algorithms. The core idea of our method is inspired by the well-known digital Dolby noise reduction procedure that dates back to the 1960s. The theoretical analysis presented could be of independent interest, as it provides clear results for the time-adaptive nature of the errors. An experimental evaluation on real streaming datasets demonstrates the superiority of the described method over Hokusai in estimating point and range queries over time. The method is simple to implement and offers a variety of design choices for future extensions. The simplicity of the procedure and the method's generalization of classic sketching techniques give hope for wide applicability of Ada-sketches in practice.
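To make the sketching idea concrete, here is a minimal, hypothetical illustration of a count-min sketch with Dolby-style pre-emphasis in the spirit described above: an update at time t is amplified by a growing weight before insertion and de-amplified at query time, so recent counts suffer smaller relative error. The width, depth, and weight function below are illustrative choices, not the parameters of the Ada-sketch paper.

    import random

    class TimeAdaptiveCMS:
        """Count-min sketch over (item, time) pairs with a time-dependent pre-emphasis weight."""
        def __init__(self, width=2048, depth=4, seed=42):
            rng = random.Random(seed)
            self.width, self.depth = width, depth
            self.seeds = [rng.randrange(1 << 30) for _ in range(depth)]
            self.table = [[0] * width for _ in range(depth)]

        def _buckets(self, item, t):
            for row, s in enumerate(self.seeds):
                yield row, hash((item, t, s)) % self.width

        @staticmethod
        def _weight(t):
            return t + 1  # hypothetical pre-emphasis: later time steps get larger weights

        def update(self, item, t, count=1):
            w = self._weight(t)
            for row, col in self._buckets(item, t):
                self.table[row][col] += w * count

        def query(self, item, t):
            w = self._weight(t)
            return min(self.table[row][col] for row, col in self._buckets(item, t)) / w

    # Collision noise in any counter is divided by the query-time weight,
    # so the overestimation error shrinks for recent time steps.
    s = TimeAdaptiveCMS()
    s.update("item", t=99)
    print(s.query("item", t=99))   # close to 1.0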
Hashing has shown its efficiency and effectiveness in facilitating large-scale multimedia applications. Supervised knowledge (e.g., semantic labels or pair-wise relationships) associated with data can significantly improve the quality of hash codes and hash functions. However, confronted with the rapid growth of newly-emerging concepts and multimedia data on the Web, existing supervised hashing approaches may easily suffer from the scarcity and limited validity of supervised information due to the expensive cost of manual labelling. In this paper, we propose a novel hashing scheme, termed zero-shot hashing (ZSH), which compresses images of "unseen" categories to binary codes with hash functions learned from limited training data of "seen" categories. Specifically, we project independent data labels (i.e., 0/1-form label vectors) into a semantic embedding space, where semantic relationships among all the labels can be precisely characterized and supervised knowledge from seen classes can thus be transferred to unseen classes. Moreover, to cope with the semantic shift problem, we rotate the embedded space to better align the embedded semantics with the low-level visual feature space, thereby alleviating the influence of the semantic gap. Meanwhile, to exert positive effects on learning high-quality hash functions, we further propose to preserve local structural properties and the discrete nature of binary codes. Besides, we develop an efficient alternating algorithm to solve the ZSH model. Extensive experiments conducted on various real-life datasets show the superior zero-shot image retrieval performance of ZSH compared to several state-of-the-art hashing methods.
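As a rough, hypothetical illustration of the label-embedding and rotation steps described above (not the paper's actual optimization), the sketch below maps 0/1 label vectors into a word-vector space, solves an orthogonal Procrustes problem to rotate the embedded semantics toward the visual feature space, and derives toy binary codes from a least-squares hash projection. The variable names, the Procrustes step, and the assumption that the visual features share the semantic dimension d are all illustrative choices.

    import numpy as np

    def embed_labels(Y, word_vecs):
        """Y: (n, c) 0/1 label matrix; word_vecs: (c, d) embedding of each class name.
        Returns an (n, d) semantic representation in which label similarity is meaningful."""
        return Y @ word_vecs

    def align_rotation(S, V):
        """Orthogonal R minimizing ||S @ R - V||_F (classic Procrustes step), standing in
        for the rotation of the embedded space toward the visual features V of shape (n, d)."""
        U, _, Vt = np.linalg.svd(S.T @ V)
        return U @ Vt

    def learn_hash_projection(V, S, R):
        """Toy linear hash functions: least-squares map W from visual features to the
        rotated semantic targets; binary codes are then sign(V @ W)."""
        W, *_ = np.linalg.lstsq(V, S @ R, rcond=None)
        return W

    def binary_codes(V, W):
        return np.sign(V @ W)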
With rapidly increasing IPv6 network traffic, some network processing systems, such as DPI engines and firewalls, cannot meet the demand of high network bandwidth. Hash-based flow tables are one of the bottlenecks. In this paper, we measure the characteristics of IPv6 addresses and propose an entropy-based revised hash algorithm, which produces a better distribution within acceptable time. Moreover, we use a hierarchical hash strategy to further reduce hash table lookups, even in extreme cases.
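The abstract does not give the algorithm's details; as a minimal sketch of the general idea, the code below measures per-byte entropy over observed IPv6 addresses and folds only the high-entropy bytes into the flow-table hash, so that low-variability prefix bytes do not skew the bucket distribution. The threshold, multiplier, and table size are assumed values.

    import math
    from collections import Counter

    def byte_entropy(addresses):
        """Per-position Shannon entropy (bits) over 16-byte IPv6 addresses."""
        ent = []
        for i in range(16):
            counts = Counter(a[i] for a in addresses)
            total = sum(counts.values())
            ent.append(-sum(c / total * math.log2(c / total) for c in counts.values()))
        return ent

    def entropy_weighted_hash(addr, ent, threshold=2.0, table_size=1 << 16):
        # Mix only the bytes whose measured entropy exceeds the threshold.
        h = 0
        for i, b in enumerate(addr):
            if ent[i] >= threshold:
                h = (h * 131 + b) & 0xFFFFFFFF
        return h % table_size

    # Usage with synthetic addresses sharing a /64 prefix (in practice use
    # socket.inet_pton(socket.AF_INET6, address_string) to obtain the 16 bytes).
    import os
    addrs = [b"\x20\x01\x0d\xb8\x00\x00\x00\x00" + os.urandom(8) for _ in range(1000)]
    ent = byte_entropy(addrs)
    print(entropy_weighted_hash(addrs[0], ent))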
In this paper, we propose ParaRegex, a novel approach for fast parallel regular expression matching. ParaRegex is a framework that implements data-parallel regular expression matching for deterministic finite automaton-based methods. Experimental evaluation shows that ParaRegex produces a fast matching engine with speedups of up to 6x over sequential implementations on a commodity 8-thread workstation.
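The abstract does not spell out ParaRegex's parallelization strategy; the sketch below shows one standard data-parallel DFA scheme (an assumption, not necessarily the paper's): each input chunk computes, for every possible start state, the state the DFA would reach, and the per-chunk maps are then composed in order. CPython threads are used only to show the structure; real speedups require processes or a runtime without a global interpreter lock.

    from concurrent.futures import ThreadPoolExecutor

    def chunk_map(dfa, chunk):
        """Map every possible start state to the state reached after consuming the chunk."""
        result = {}
        for start in dfa:
            state = start
            for sym in chunk:
                state = dfa[state][sym]
            result[start] = state
        return result

    def parallel_match(dfa, start, accepting, text, workers=8):
        size = max(1, len(text) // workers)
        chunks = [text[i:i + size] for i in range(0, len(text), size)]
        with ThreadPoolExecutor(max_workers=workers) as pool:
            maps = list(pool.map(lambda c: chunk_map(dfa, c), chunks))
        state = start
        for m in maps:            # compose the per-chunk maps in input order
            state = m[state]
        return state in accepting

    # Toy DFA over {'a', 'b'} accepting any string containing "ab".
    dfa = {0: {'a': 1, 'b': 0}, 1: {'a': 1, 'b': 2}, 2: {'a': 2, 'b': 2}}
    print(parallel_match(dfa, start=0, accepting={2}, text="bbaaab" * 100))   # True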
To match the signatures of malicious traffic across packet boundaries, network-intrusion detection (and prevention) systems (NIDS) typically perform pattern matching after flow reassembly or packet reordering. However, this may lead to the need for large packet buffers, making detection vulnerable to denial-of-service (DoS) attacks, whereby attackers exhaust the buffer capacity by sending long sequences of out-of-order packets. While researchers have proposed solutions for exact-match patterns, regular-expression matching on out-of-order packets is still an open problem. Specifically, a key challenge is the matching of complex sub-patterns (such as repetitions of wildcards matched at the boundary between packets). Our proposed approach leverages the insight that the various segments matching the same repetitive sub-pattern are logically equivalent to the regular-expression matching engine, and thus interchanging them does not affect the final result. In this paper, we present O3FA, a new finite-automata-based deep packet inspection engine that performs regular-expression matching on out-of-order packets without requiring flow reassembly. O3FA consists of a deterministic finite automaton (FA) coupled with a set of prefix-/suffix-FAs, which allows processing out-of-order packets on the fly. We present our design, optimization, and evaluation for the O3FA engine. Our experiments show that our design requires 20x-4000x less buffer space than conventional buffering-and-reassembling schemes on various datasets and that it can process packets in real time, i.e., without reassembly.
The increasing complexity of cyber-attacks necessitates the design of more efficient hardware architectures for real-time Intrusion Detection Systems (IDSs). String matching is the main performance-demanding component of an IDS. An effective technique to design high-performance string matching engines is to partition the target set of strings into multiple subgroups and to use a parallel string matching hardware unit for each subgroup. This paper introduces a novel pattern grouping algorithm for heterogeneous bit-split string matching architectures. The proposed algorithm presents a reliable method to estimate the correlation between strings. The correlation factors are then used to find a preferred group for each string in a seed growing approach. Experimental results demonstrate that the proposed algorithm achieves an average of 41% reduction in memory consumption compared to the best existing approach found in the literature, while offering orders of magnitude faster execution time compared to an exhaustive search.
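The correlation estimate and grouping heuristic are not specified in the abstract; purely as an illustration of a seed-growing grouping, the sketch below scores two patterns by their longest common substring and greedily assigns every string to the seed it correlates with most. The scoring function and seed selection are stand-ins, not the paper's method.

    from difflib import SequenceMatcher

    def correlation(a, b):
        """Illustrative correlation: length of the longest common substring."""
        return SequenceMatcher(None, a, b).find_longest_match(0, len(a), 0, len(b)).size

    def group_patterns(patterns, num_groups):
        seeds = sorted(patterns, key=len, reverse=True)[:num_groups]   # longest strings as seeds
        groups = {s: [s] for s in seeds}
        for p in patterns:
            if p in seeds:
                continue
            best = max(seeds, key=lambda s: correlation(p, s))         # grow the closest group
            groups[best].append(p)
        return list(groups.values())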
In this paper, we analyze the performance and cost trade-off of selecting between two node representations when implementing the Aho-Corasick algorithm. This algorithm can be used for pattern matching in network-based intrusion detection systems such as Snort. Our analysis uses the Snort 2.9.7 rule set, which contains almost 26k patterns. Our methodology consists of code profiling and analysis, followed by the selection of a parameter to maximize a metric that combines clock cycle count and memory usage. The parameter determines which of the two node types is selected for each trie node. We show that it is possible to select the parameter to optimize the metric, which results in an improvement of up to 12× compared with the single node-type case.
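The two node representations compared in the paper are not detailed in the abstract; a common pair, used here purely as an illustration, is a dense 256-entry jump table (constant-time transitions, more memory) versus a sorted sparse child list (compact, binary-searched). A threshold on the fan-out plays the role of the selection parameter, and only the goto structure of the Aho-Corasick trie is shown, not the failure links.

    import bisect

    K = 8  # hypothetical threshold on the number of children

    class DenseNode:
        """256-entry jump table: O(1) transitions, 256 slots of memory per node."""
        def __init__(self, children):          # children: dict mapping byte -> child node
            self.next = [None] * 256
            for b, n in children.items():
                self.next[b] = n
        def goto(self, b):
            return self.next[b]

    class SparseNode:
        """Sorted child list: memory proportional to fan-out, transitions via binary search."""
        def __init__(self, children):
            self.keys = sorted(children)
            self.nodes = [children[k] for k in self.keys]
        def goto(self, b):
            i = bisect.bisect_left(self.keys, b)
            return self.nodes[i] if i < len(self.keys) and self.keys[i] == b else None

    def pick_node(children):
        # The paper's cycle/memory metric reduces, in this toy version, to a fan-out test.
        return DenseNode(children) if len(children) >= K else SparseNode(children)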
This study aims to identify and quantify applications that make use of port numbers typically associated with other major Internet applications (i.e., ports 53, 80, 123, 443, 8000, and 8080) to bypass port-based traffic controls such as firewalls. We use lightweight packet inspection to examine each flow observed using these ports on our campus network over the course of a week in September 2015 and identify applications that produce network traffic that does not match the expected application for each port. We find that numerous programs co-opt the port numbers of major Internet applications on our campus, many of which are Chinese in origin and are not recognized by existing traffic classification tools. As a result of our investigation, new rules for identifying over 20 new applications have been made available to the research community.
Deep packet inspection (DPI) is widely used in content-aware network applications to detect string features. Improving DPI performance is of vital importance given ever-increasing link speeds. In this demo, we propose a novel DPI architecture with a hierarchical memory structure and parallel matching engines based on a memory-centric FPGA. The implemented DPI prototype is able to provide up to 60 Gbps full-text string matching throughput and fast rule update speed.
Large-scale applications in data centers are composed of computers connected by a network. Traditional network switches cannot be flexibly controlled, so application developers cannot optimize the behavior of network elements to improve application performance. On the other hand, Deeply Programmable Network (DPN) switches can fully analyze packet payloads and be profoundly programmed. In this paper, we focus on processing part of an application's functions in network elements to improve application performance based on Deep Packet Inspection (DPI), i.e., analyzing packet payloads, using DPN switches. We take several applications as targets and implement some of their functions in network switches. We then compare performance with and without our method, and show that our method can significantly increase application performance.
Internet censorship is used in many parts of the world to prohibit free access to online information. Different techniques such as IP address or URL blocking, DNS hijacking, or deep packet inspection are used to block access to specific content on the Internet. In response, several censorship circumvention systems have been proposed that attempt to bypass existing filters. Systems that hide the communication in different types of cover protocols have attracted particular attention. However, recent research results suggest that this kind of covert traffic can be easily detected by censors. In this paper, we present SkypeLine, a censorship circumvention system that leverages Direct-Sequence Spread Spectrum (DSSS) based steganography to hide information in Voice-over-IP (VoIP) communication. SkypeLine introduces two novel modulation techniques that hide data by modulating information bits onto the voice carrier signal using pseudo-random, orthogonal noise sequences and repeating the spreading operation several times. Our design goals focus on undetectability in the presence of a strong adversary and improved data rates. As a result, the hiding is inconspicuous, does not alter the statistical characteristics of the carrier signal, and is robust against alterations of the transmitted packets. We demonstrate the performance of SkypeLine based on two simulation studies that cover its theoretical performance and robustness. Our measurements demonstrate that the data rates achieved with our techniques substantially exceed those of existing DSSS approaches. Furthermore, we prove the real-world applicability of the presented system with an exemplary prototype for Skype.
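As background for the spreading idea (a toy version, not SkypeLine's actual modulation), the sketch below spreads each information bit over a pseudo-random +/-1 chip sequence, adds it to the voice samples at a small amplitude, and recovers the bit by correlating the received block with the same sequence. Chip length, amplitude, and the PRNG seed are arbitrary illustrative choices.

    import random

    def spread(bits, chips_per_bit=64, amplitude=0.01, seed=7):
        rng = random.Random(seed)
        chips = [rng.choice((-1, 1)) for _ in range(chips_per_bit)]
        signal = []
        for bit in bits:
            sym = 1 if bit else -1
            signal.extend(amplitude * sym * c for c in chips)   # one low-power block per bit
        return chips, signal

    def despread(received, chips, chips_per_bit=64):
        bits = []
        for i in range(0, len(received), chips_per_bit):
            block = received[i:i + chips_per_bit]
            corr = sum(x * c for x, c in zip(block, chips))     # correlate with the chip sequence
            bits.append(1 if corr > 0 else 0)
        return bits

    # Embed into a silent carrier and recover; a real carrier adds voice samples instead of zeros.
    chips, watermark = spread([1, 0, 1, 1])
    carrier = [0.0] * len(watermark)
    received = [c + w for c, w in zip(carrier, watermark)]
    assert despread(received, chips) == [1, 0, 1, 1]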
We propose a method for classifying Wi-Fi-enabled mobile handheld devices (smartphones) and non-handheld devices (laptops) in a completely passive way, that is, resorting neither to traffic probes on network edge devices nor to deep packet inspection techniques for reading application-layer information. Instead, classification is performed starting from probe request Wi-Fi frames, which can be sniffed with inexpensive commercial hardware. We extract distinctive features from probe request frames (how many probe requests are transmitted by each device, how frequently, etc.) and take a machine learning approach, training four different classifiers to recognize the two types of devices. We compare the performance of the different classifiers and identify a solution based on a Random Decision Forest that correctly classifies devices 95% of the time. The classification method is then used as a pre-processing stage to analyze network traffic traces from the wireless network of a university building, with interesting considerations on the way different types of devices use the network (amount of data exchanged, duration of connections, etc.). The proposed methodology finds application in many scenarios related to Wi-Fi network management/optimization and Wi-Fi-based services.
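A minimal sketch of such a pipeline is shown below; the specific features (probe count, mean inter-probe gap, distinct SSIDs) and the toy labels are assumptions for illustration, not the paper's exact feature set, and scikit-learn's RandomForestClassifier stands in for the Random Decision Forest.

    from sklearn.ensemble import RandomForestClassifier

    def device_features(probe_times, probe_ssids):
        """probe_times: sorted timestamps of probe requests seen from one MAC address."""
        gaps = [b - a for a, b in zip(probe_times, probe_times[1:])] or [0.0]
        return [
            len(probe_times),              # how many probe requests were observed
            sum(gaps) / len(gaps),         # mean inter-probe gap (seconds)
            len(set(probe_ssids)),         # number of distinct SSIDs probed
        ]

    # One feature vector per device; toy labels: 1 = handheld, 0 = non-handheld.
    X = [device_features([0, 2, 5, 9], ["home", "cafe", "home"]),
         device_features([0, 30, 65], ["corp"])]
    y = [1, 0]
    clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
    print(clf.predict([device_features([1, 3, 4, 8], ["a", "b"])]))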
The usual approach to security for cloud-hosted applications is strong separation. However, it is often the case that the same data is used by different applications, particularly given the increase in data-driven (`big data' and IoT) applications. We argue that access control for the cloud should no longer be application-specific but should be data-centric, associated with the data that can flow between applications. Indeed, the data may originate outside cloud services from diverse sources such as medical monitoring, environmental sensing, etc. Information Flow Control (IFC) potentially offers data-centric, system-wide data access control. It has been shown that IFC can be provided at the operating system level as part of a PaaS offering, with an acceptable overhead. In this paper we consider how IFC can be integrated with application-specific access control, transparently to application developers, while building, from simple IFC primitives, access control policies that align with the data management obligations of cloud providers and tenants.
In international military coalitions, situation awareness is achieved by gathering critical intel from different authorities. Authorities want to retain control over their data, as they are sensitive by nature, and, thus, usually employ their own authorization solutions to regulate access to them. In this paper, we highlight that harmonizing authorization solutions at the coalition level raises many challenges. We demonstrate how we address authorization challenges in the context of a scenario defined by military experts using a prototype implementation of SAFAX, an XACML-based architectural framework tailored to the development of authorization services for distributed systems.
Embedded systems are becoming increasingly complex as designers integrate different functionalities into a single application for execution on heterogeneous hardware platforms. In this work we propose a system-level security approach in order to provide isolation of tasks without the need to trust a central authority at run-time. We discuss security requirements that can be found in complex embedded systems that use heterogeneous execution platforms, and by regulating memory access we create mechanisms that allow safe use of shared IP with direct memory access, as well as shared libraries. We also present a prototype Isolation Unit that checks memory transactions and allows for dynamic configuration of permissions.
New hardware security threats are identified in emerging three-dimensional (3D) integrated circuits (ICs) and potential countermeasures are introduced. Trigger and payload mechanisms for future 3D hardware Trojans are predicted. Furthermore, a novel, network-on-chip based 3D obfuscation method is proposed to block the direct communication between two commercial dies in a 3D structure, thus thwarting reverse engineering attacks on the vertical dimension. Simulation results demonstrate that the proposed method effectively obfuscates the cross-plane communication by increasing the reverse engineering time by approximately 5x as compared to using direct through silicon via (TSV) connections. The proposed method consumes approximately one fifth the area and power of a typical network-on-chip designed in a 65 nm technology, exhibiting limited overhead.
As chip multiprocessors (CMPs) become more susceptible to process variation, crosstalk, and hard and soft errors, emerging threats from rogue employees in a compromised foundry are creating new vulnerabilities that could undermine the integrity of our chips with malicious alterations. As the Network-on-Chip (NoC) is a focal point of sensitive data transfer and critical device coordination, there is an urgent demand for secure and reliable communication. In this paper we propose Secure Model Checkers (SMCs), a real-time solution for control logic verification and functional correctness in the micro-architecture to detect Hardware Trojan (HT)-induced denial-of-service attacks and improve reliability. In our evaluation, we show that SMCs provide significant security enhancements in real time with only 1.5% power and 1.1% area overhead in the micro-architecture.
In the near future, billions of new smart devices will connect to the Internet of Things, playing a key role in our daily life. Enabling IPv6 on low-power, resource-constrained devices leads research to focus on novel approaches that aim to improve the efficiency, security, and performance of the 6LoWPAN adaptation layer. This work-in-progress paper proposes a hardware-based Network Packet Filtering (NPF) unit and an IPv6 link-local address calculator, which are able to filter received IPv6 packets, offering nearly 18% overhead reduction. The goal is to obtain a System-on-Chip implementation that can be deployed in future IEEE 802.15.4 radio modules.
The Internet has become the largest and most prominent communication network, serving as social, industrial, and public infrastructure, since its invention in the late 1960s. In a historical retrospective of the Internet's evolution, its architecture has repeatedly evolved through various technical challenges: for instance, in the early 1990s the Internet faced a scalability crisis, which was overcome shortly afterwards by applying emerging techniques such as CIDR, NAT, and IPv6. This paper emphasizes scalability issues as the key technical challenge, forecasting that the Internet of Things era has arrived. First, we describe the identifier/locator separation scheme, which can achieve a dramatic architectural evolution from a historical perspective. Additionally, we review various kinds of identifier/locator separation schemes, because such schemes have recently become a major design pillar for the future Internet architecture, in both clean-slate future Internet architectures and evolving Internet architectures. Lastly, we present the results of an analysis, summarized in a table, for the future Internet of Everything, in which the number of Internet-connected devices will grow to more than 20 billion by 2020.

