Visible to the public Biblio

Filters: Keyword is string matching  [Clear All Filters]
2021-01-18
Sun, J., Ma, J., Quan, J., Zhu, X., I, C..  2019.  A Fuzzy String Matching Scheme Resistant to Statistical Attack. 2019 International Conference on Networking and Network Applications (NaNA). :396–402.
The fuzzy query scheme based on vector index uses Bloom filter to construct vector index for key words. Then the statistical attack based on the deviation of frequency distribution of the vector index brings out the sensitive information disclosure. Using the noise vector, a fuzzy query scheme resistant to the statistical attack serving for encrypted database, i.e. S-BF, is introduced. With the noise vector to clear up the deviation of frequency distribution of vector index, the statistical attacks to the vector index are resolved. Demonstrated by lab experiment, S-BF scheme can achieve the secure fuzzy query with the powerful privation protection capability for encrypted cloud database without the loss of fuzzy query efficiency.
2021-01-11
Khandait, P., Hubballi, N., Mazumdar, B..  2020.  Efficient Keyword Matching for Deep Packet Inspection based Network Traffic Classification. 2020 International Conference on COMmunication Systems NETworkS (COMSNETS). :567–570.
Network traffic classification has a range of applications in network management including QoS and security monitoring. Deep Packet Inspection (DPI) is one of the effective method used for traffic classification. DPI is computationally expensive operation involving string matching between payload and application signatures. Existing traffic classification techniques perform multiple scans of payload to classify the application flows - first scan to extract the words and the second scan to match the words with application signatures. In this paper we propose an approach which can classify network flows with single scan of flow payloads using a heuristic method to achieve a sub-linear search complexity. The idea is to scan few initial bytes of payload and determine potential application signature(s) for subsequent signature matching. We perform experiments with a large dataset containing 171873 network flows and show that it has a good classification accuracy of 98%.
2019-11-19
Bontupalli, Venkataramesh, Yakopcic, Chris, Hasan, Raqibul, Taha, Tarek M..  2018.  Efficient Memristor-Based Architecture for Intrusion Detection and High-Speed Packet Classification. J. Emerg. Technol. Comput. Syst.. 14:41:1-41:27.

Deep packet inspection (DPI) is a critical component to prevent intrusion detection. This requires a detailed analysis of each network packet header and body. Although this is often done on dedicated high-power servers in most networked systems, mobile systems could potentially be vulnerable to attack if utilized on an unprotected network. In this case, having DPI hardware on the mobile system would be highly beneficial. Unfortunately, DPI hardware is generally area and power consuming, making its implementation difficult in mobile systems. We developed a memristor crossbar-based approach, inspired by memristor crossbar neuromorphic circuits, for a low-power, low-area, and high-throughput DPI system that examines both the header and body of a packet. Two key types of circuits are presented: static pattern matching and regular expression circuits. This system is able to reduce execution time and power consumption due to its high-density grid and massive parallelism. Independent searches are performed using low-power memristor crossbar arrays giving rise to a throughput of 160Gbps with no loss in the classification accuracy.

2018-09-12
Domínguez, A., Carballo, P. P., Núñez, A..  2017.  Programmable SoC platform for deep packet inspection using enhanced Boyer-Moore algorithm. 2017 12th International Symposium on Reconfigurable Communication-centric Systems-on-Chip (ReCoSoC). :1–8.

This paper describes the work done to design a SoC platform for real-time on-line pattern search in TCP packets for Deep Packet Inspection (DPI) applications. The platform is based on a Xilinx Zynq programmable SoC and includes an accelerator that implements a pattern search engine that extends the original Boyer-Moore algorithm with timing and logical rules, that produces a very complex set of rules. Also, the platform implements different modes of operation, including SIMD and MISD parallelism, which can be configured on-line. The platform is scalable depending of the analysis requirement up to 8 Gbps. High-Level synthesis and platform based design methodologies have been used to reduce the time to market of the completed system.

2018-04-04
Ran, L., Lu, L., Lin, H., Han, M., Zhao, D., Xiang, J., Yu, H., Ma, X..  2017.  An Experimental Study of Four Methods for Homology Analysis of Firmware Vulnerability. 2017 International Conference on Dependable Systems and Their Applications (DSA). :42–50.

In the production process of embedded device, due to the frequent reuse of third-party libraries or development kits, there are large number of same vulnerabilities that appear in more than one firmware. Homology analysis is often used in detecting this kind of vulnerabilities caused by code reuse or third-party reuse and in the homology analysis, the widely used methods are mainly Binary difference analysis, Normalized compression distance, String feature matching and Fuzz hash. But when we use these methods for homology analysis, we found that the detection result is not ideal and there is a high false positive rate. Focusing on this problem, we analyzed the application scenarios of these four methods and their limitations by combining different methods and different types of files and the experiments show that the combination of methods and files have a better performance in homology analysis.

2017-08-18
Aljamea, Moudhi M., Iliopoulos, Costas S., Samiruzzaman, M..  2016.  Detection Of URL In Image Steganography. Proceedings of the International Conference on Internet of Things and Cloud Computing. :23:1–23:6.

Steganography is the science of hiding data within data. Either for the good purpose of secret communication or for the bad intention of leaking sensitive confidential data or embedding malicious code or URL. However, many different carrier file formats can be used to hide these data (network, audio, image..etc) but the most common steganography carrier is embedding secret data within images as it is considered to be the best and easiest way to hide all types of files (secret files) within an image using different formats (another image, text, video, virus, URL..etc). To the human eye, the changes in the image appearance with the hidden data can be imperceptible. In fact, images can be more than what we see with our eyes. Therefore, many solutions where proposed to help in detecting these hidden data but each solution have their own strong and weak points either by the limitation of resolving one type of image along with specific hiding technique and or most likely without extracting the hidden data. This paper intends to propose a novel detection approach that will concentrate on detecting any kind of hidden URL in all types of images and extract the hidden URL from the carrier image that used the LSB least significant bit hiding technique.

2017-05-16
Vakili, Shervin, Langlois, J.M. Pierre, Boughzala, Bochra, Savaria, Yvon.  2016.  Memory-Efficient String Matching for Intrusion Detection Systems Using a High-Precision Pattern Grouping Algorithm. Proceedings of the 2016 Symposium on Architectures for Networking and Communications Systems. :37–42.

The increasing complexity of cyber-attacks necessitates the design of more efficient hardware architectures for real-time Intrusion Detection Systems (IDSs). String matching is the main performance-demanding component of an IDS. An effective technique to design high-performance string matching engines is to partition the target set of strings into multiple subgroups and to use a parallel string matching hardware unit for each subgroup. This paper introduces a novel pattern grouping algorithm for heterogeneous bit-split string matching architectures. The proposed algorithm presents a reliable method to estimate the correlation between strings. The correlation factors are then used to find a preferred group for each string in a seed growing approach. Experimental results demonstrate that the proposed algorithm achieves an average of 41% reduction in memory consumption compared to the best existing approach found in the literature, while offering orders of magnitude faster execution time compared to an exhaustive search.

Lacroix, Alexsandre B., Langlois, J.M. Pierre, Boyer, François-Raymond, Gosselin, Antoine, Bois, Guy.  2016.  Node Configuration for the Aho-Corasick Algorithm in Intrusion Detection Systems. Proceedings of the 2016 Symposium on Architectures for Networking and Communications Systems. :121–122.

In this paper, we analyze the performance and cost trade-off from selecting two representations of nodes when implementing the Aho-Corasick algorithm. This algorithm can be used for pattern matching in network-based intrusion detection systems such as Snort. Our analysis uses the Snort 2.9.7 rules set, which contains almost 26k patterns. Our methodology consists of code profiling and analysis, followed by the selection of a parameter to maximize a metric that combines clock cycles count and memory usage. The parameter determines which of two types of nodes is selected for each trie node. We show that it is possible to select the parameter to optimize the metric, which results in an improvement by up to 12× compared with the single node-type case.

Su, Jinshu, Chen, Shuhui, Han, Biao, Xu, Chengcheng, Wang, Xin.  2016.  A 60Gbps DPI Prototype Based on Memory-Centric FPGA. Proceedings of the 2016 ACM SIGCOMM Conference. :627–628.

Deep packet inspection (DPI) is widely used in content-aware network applications to detect string features. It is of vital importance to improve the DPI performance due to the ever-increasing link speed. In this demo, we propose a novel DPI architecture with a hierarchy memory structure and parallel matching engines based on memory-centric FPGA. The implemented DPI prototype is able to provide up to 60Gbps full-text string matching throughput and fast rules update speed.

2015-05-01
Yanbing Liu, Qingyun Liu, Ping Liu, Jianlong Tan, Li Guo.  2014.  A factor-searching-based multiple string matching algorithm for intrusion detection. Communications (ICC), 2014 IEEE International Conference on. :653-658.

Multiple string matching plays a fundamental role in network intrusion detection systems. Automata-based multiple string matching algorithms like AC, SBDM and SBOM are widely used in practice, but the huge memory usage of automata prevents them from being applied to a large-scale pattern set. Meanwhile, poor cache locality of huge automata degrades the matching speed of algorithms. Here we propose a space-efficient multiple string matching algorithm BVM, which makes use of bit-vector and succinct hash table to replace the automata used in factor-searching-based algorithms. Space complexity of the proposed algorithm is O(rm2 + ΣpϵP |p|), that is more space-efficient than the classic automata-based algorithms. Experiments on datasets including Snort, ClamAV, URL blacklist and synthetic rules show that the proposed algorithm significantly reduces memory usage and still runs at a fast matching speed. Above all, BVM costs less than 0.75% of the memory usage of AC, and is capable of matching millions of patterns efficiently.

2015-04-30
Yanbing Liu, Qingyun Liu, Ping Liu, Jianlong Tan, Li Guo.  2014.  A factor-searching-based multiple string matching algorithm for intrusion detection. Communications (ICC), 2014 IEEE International Conference on. :653-658.

Multiple string matching plays a fundamental role in network intrusion detection systems. Automata-based multiple string matching algorithms like AC, SBDM and SBOM are widely used in practice, but the huge memory usage of automata prevents them from being applied to a large-scale pattern set. Meanwhile, poor cache locality of huge automata degrades the matching speed of algorithms. Here we propose a space-efficient multiple string matching algorithm BVM, which makes use of bit-vector and succinct hash table to replace the automata used in factor-searching-based algorithms. Space complexity of the proposed algorithm is O(rm2 + ΣpϵP |p|), that is more space-efficient than the classic automata-based algorithms. Experiments on datasets including Snort, ClamAV, URL blacklist and synthetic rules show that the proposed algorithm significantly reduces memory usage and still runs at a fast matching speed. Above all, BVM costs less than 0.75% of the memory usage of AC, and is capable of matching millions of patterns efficiently.