Biblio
Deep packet inspection (DPI) is a critical component of intrusion detection. It requires a detailed analysis of each network packet's header and body. Although in most networked systems this is done on dedicated high-power servers, mobile systems can be vulnerable to attack when used on an unprotected network. In this case, having DPI hardware on the mobile system would be highly beneficial. Unfortunately, DPI hardware generally consumes considerable area and power, making it difficult to implement in mobile systems. We developed a memristor crossbar-based approach, inspired by memristor crossbar neuromorphic circuits, for a low-power, low-area, and high-throughput DPI system that examines both the header and body of a packet. Two key types of circuits are presented: static pattern matching circuits and regular expression circuits. The system reduces execution time and power consumption through its high-density grid and massive parallelism. Independent searches are performed using low-power memristor crossbar arrays, giving a throughput of 160 Gbps with no loss in classification accuracy.
This paper describes the design of an SoC platform for real-time on-line pattern search in TCP packets for Deep Packet Inspection (DPI) applications. The platform is based on a Xilinx Zynq programmable SoC and includes an accelerator implementing a pattern search engine that extends the original Boyer-Moore algorithm with timing and logical rules, producing a very complex set of rules. The platform also implements different modes of operation, including SIMD and MISD parallelism, which can be configured on-line. The platform scales with the analysis requirement up to 8 Gbps. High-level synthesis and platform-based design methodologies have been used to reduce the time to market of the completed system.
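For orientation, a minimal software sketch of the kind of skip-based search that Boyer-Moore performs. It uses the Horspool simplification (bad-character shifts only), not the extended algorithm of the paper, and it does not model the timing or logical rules. The function name `horspool_find_all` is hypothetical.

```python
def horspool_find_all(text: bytes, pattern: bytes) -> list[int]:
    """Return start offsets of every occurrence of pattern in text.

    Boyer-Moore-Horspool: only the bad-character shift table is used,
    a simplification of the original Boyer-Moore algorithm.
    """
    m, n = len(pattern), len(text)
    if m == 0 or m > n:
        return []
    # Default shift is the full pattern length; bytes that occur in the
    # pattern (except its last position) allow only a shorter shift.
    shift = [m] * 256
    for i in range(m - 1):
        shift[pattern[i]] = m - 1 - i
    hits, pos = [], 0
    while pos <= n - m:
        if text[pos:pos + m] == pattern:
            hits.append(pos)
        # Shift by the table entry for the text byte under the pattern's end.
        pos += shift[text[pos + m - 1]]
    return hits
```

The shift table lets the search skip over parts of the payload without examining every byte, which is the property that makes this family of algorithms attractive for payload scanning.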
In the production of embedded devices, third-party libraries and development kits are frequently reused, so the same vulnerabilities often appear in more than one firmware image. Homology analysis is often used to detect vulnerabilities caused by code reuse or third-party reuse; the most widely used methods are binary difference analysis, normalized compression distance, string feature matching, and fuzzy hashing. When we applied these methods to homology analysis, however, we found that the detection results were not ideal and the false positive rate was high. Focusing on this problem, we analyzed the application scenarios and limitations of these four methods by combining different methods with different types of files. The experiments show that combining methods and file types gives better performance in homology analysis.
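Of the four methods named, normalized compression distance (NCD) is the simplest to illustrate. The sketch below uses the standard definition NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y)), where C is the compressed size. The choice of zlib as the compressor is an assumption; the abstract does not say which compressor the authors used.

```python
import zlib


def compressed_size(data: bytes) -> int:
    """Size in bytes of data after zlib compression (assumed compressor)."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance between two byte strings.

    Values near 0 suggest shared content; values near 1 suggest the
    inputs have little in common.
    """
    cx, cy = compressed_size(x), compressed_size(y)
    cxy = compressed_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)
```

In a homology setting the two inputs would be, for example, two firmware files or two extracted binaries, and a low distance would flag them as candidates for shared code.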
Steganography is the science of hiding data within data, whether for the legitimate purpose of secret communication or for the malicious purpose of leaking sensitive confidential data or embedding malicious code or URLs. Many different carrier formats can be used to hide data (network, audio, image, etc.), but the most common carrier is the image, as it is considered the best and easiest way to hide all types of secret content (another image, text, video, virus, URL, etc.). To the human eye, the changes that hidden data makes to an image's appearance can be imperceptible. In fact, images can contain more than what we see with our eyes. Many solutions have therefore been proposed to detect hidden data, but each has its own strengths and weaknesses: they are typically limited to one image type and one specific hiding technique, and most do not extract the hidden data. This paper proposes a novel detection approach that concentrates on detecting any kind of hidden URL in all types of images and on extracting the hidden URL from a carrier image that uses the least significant bit (LSB) hiding technique.
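A minimal sketch of LSB hiding and URL recovery, to make the technique concrete. It is not the detection approach of the paper. It assumes the carrier is already decoded into a flat sequence of 8-bit samples (for example, RGB channel values), that payload bits are written most significant bit first into consecutive samples, and that a simple regular expression is enough to spot a URL. All function names are hypothetical.

```python
import re

# Assumed URL shape: scheme followed by printable, non-space ASCII.
URL_RE = re.compile(rb"https?://[\x21-\x7e]+")


def embed_lsb(samples: bytes, payload: bytes) -> bytes:
    """Write payload bits (MSB first) into the lowest bit of each sample."""
    bits = [(byte >> s) & 1 for byte in payload for s in range(7, -1, -1)]
    if len(bits) > len(samples):
        raise ValueError("payload too large for carrier")
    out = bytearray(samples)
    for i, bit in enumerate(bits):
        out[i] = (out[i] & 0xFE) | bit
    return bytes(out)


def extract_lsb(samples: bytes) -> bytes:
    """Reassemble bytes from the lowest bit of every 8 consecutive samples."""
    out = bytearray()
    for i in range(0, len(samples) - 7, 8):
        value = 0
        for sample in samples[i:i + 8]:
            value = (value << 1) | (sample & 1)
        out.append(value)
    return bytes(out)


def find_urls(samples: bytes) -> list[bytes]:
    """Return URL-like byte strings found in the LSB plane of the samples."""
    return URL_RE.findall(extract_lsb(samples))
```

Because only the lowest bit of each sample changes, every sample value moves by at most 1, which is why the modification is imperceptible to the eye.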
The increasing complexity of cyber-attacks necessitates the design of more efficient hardware architectures for real-time Intrusion Detection Systems (IDSs). String matching is the main performance-demanding component of an IDS. An effective technique to design high-performance string matching engines is to partition the target set of strings into multiple subgroups and to use a parallel string matching hardware unit for each subgroup. This paper introduces a novel pattern grouping algorithm for heterogeneous bit-split string matching architectures. The proposed algorithm presents a reliable method to estimate the correlation between strings. The correlation factors are then used to find a preferred group for each string in a seed growing approach. Experimental results demonstrate that the proposed algorithm achieves an average of 41% reduction in memory consumption compared to the best existing approach found in the literature, while offering orders of magnitude faster execution time compared to an exhaustive search.
In this paper, we analyze the performance and cost trade-off of selecting between two representations of nodes when implementing the Aho-Corasick algorithm. This algorithm can be used for pattern matching in network-based intrusion detection systems such as Snort. Our analysis uses the Snort 2.9.7 rule set, which contains almost 26k patterns. Our methodology consists of code profiling and analysis, followed by the selection of a parameter to maximize a metric that combines clock-cycle count and memory usage. The parameter determines which of two types of nodes is selected for each trie node. We show that it is possible to select the parameter to optimize the metric, which results in an improvement of up to 12× compared with the single node-type case.
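The idea of choosing a node representation per trie node can be sketched as follows. This is an illustration only: the abstract does not say what the two node types are, so the sketch assumes a sparse node (a dictionary) and a dense node (a 256-entry table), selected by a hypothetical fan-out threshold `dense_threshold`.

```python
from collections import deque


def build(patterns, dense_threshold=4):
    """Build an Aho-Corasick automaton as (children, fail, out) lists."""
    children, fail, out = [{}], [0], [[]]
    for pat in patterns:                      # 1. insert patterns into a trie
        node = 0
        for byte in pat:
            nxt = children[node].get(byte)
            if nxt is None:
                nxt = len(children)
                children.append({})
                fail.append(0)
                out.append([])
                children[node][byte] = nxt
            node = nxt
        out[node].append(pat)
    queue = deque(children[0].values())       # 2. failure links, breadth-first
    while queue:
        node = queue.popleft()
        for byte, child in children[node].items():
            f = fail[node]
            while f and byte not in children[f]:
                f = fail[f]
            fail[child] = children[f].get(byte, 0)
            out[child] = out[child] + out[fail[child]]
            queue.append(child)
    for node, table in enumerate(children):   # 3. pick a representation
        if len(table) >= dense_threshold:     # high fan-out: dense table
            dense = [-1] * 256
            for byte, child in table.items():
                dense[byte] = child
            children[node] = dense
    return children, fail, out


def search(automaton, text):
    """Return (start_offset, pattern) for every match in text."""
    children, fail, out = automaton
    node, hits = 0, []
    for pos, byte in enumerate(text):
        while True:
            table = children[node]
            nxt = table[byte] if isinstance(table, list) else table.get(byte, -1)
            if nxt != -1 or node == 0:
                break
            node = fail[node]                 # follow failure link and retry
        node = nxt if nxt != -1 else 0
        hits.extend((pos - len(p) + 1, p) for p in out[node])
    return hits
```

A low threshold spends memory on dense tables for fast lookups, while a high threshold keeps nodes sparse; matching results are the same either way, so the threshold only affects the speed and memory trade-off.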
Deep packet inspection (DPI) is widely used in content-aware network applications to detect string features. Improving DPI performance is of vital importance given ever-increasing link speeds. In this demo, we propose a novel DPI architecture with a hierarchical memory structure and parallel matching engines based on a memory-centric FPGA. The implemented DPI prototype provides up to 60 Gbps full-text string matching throughput and fast rule updates.
Multiple string matching plays a fundamental role in network intrusion detection systems. Automata-based multiple string matching algorithms such as AC, SBDM and SBOM are widely used in practice, but the huge memory usage of the automata prevents them from being applied to large-scale pattern sets. Moreover, the poor cache locality of huge automata degrades matching speed. Here we propose BVM, a space-efficient multiple string matching algorithm that uses bit-vectors and a succinct hash table to replace the automata used in factor-searching-based algorithms. The space complexity of the proposed algorithm is O(rm² + Σ_{p∈P} |p|), which is more space-efficient than the classic automata-based algorithms. Experiments on datasets including Snort, ClamAV, a URL blacklist and synthetic rules show that the proposed algorithm significantly reduces memory usage while still matching quickly. Above all, BVM uses less than 0.75% of the memory of AC and is capable of matching millions of patterns efficiently.