Biblio

Filters: Keyword is hash table
2022-06-09
Palit, Shekhar, Wortman, Kevin A..  2021.  Perfect Tabular Hashing in Pseudolinear Time. 2021 IEEE 11th Annual Computing and Communication Workshop and Conference (CCWC). :0228–0232.
We present an algorithm for generating perfect tabulation hashing functions by reduction to Boolean satisfiability (SAT). Tabulation hashing is a high-performance family of hash functions for hash tables that involves computing the XOR of random lookup tables. Given n keys of word size W, we show how to compute a perfect hash function in O(nW) worst-case time. This is competitive with other perfect hashing methods, and the resultant hash functions are simple and performant.
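The XOR-of-lookup-tables idea in this abstract can be sketched as follows. This is a minimal illustration of plain (simple) tabulation hashing only, not the authors' SAT-based perfect-hash construction. The 32-bit key width, the 8-bit character width, the seeded generator, and the names `make_tables` and `tab_hash` are all assumptions made for the example.

```python
import random

KEY_BITS = 32                     # assumed word size W
CHAR_BITS = 8                     # assumed width of one key "character"
NUM_CHARS = KEY_BITS // CHAR_BITS


def make_tables(out_bits, seed=0):
    """Build one random lookup table per key character (hypothetical helper)."""
    rng = random.Random(seed)
    return [[rng.getrandbits(out_bits) for _ in range(1 << CHAR_BITS)]
            for _ in range(NUM_CHARS)]


def tab_hash(key, tables):
    """XOR together the table entries selected by each character of the key."""
    mask = (1 << CHAR_BITS) - 1
    h = 0
    for i, table in enumerate(tables):
        char = (key >> (i * CHAR_BITS)) & mask   # i-th 8-bit slice of the key
        h ^= table[char]
    return h


tables = make_tables(out_bits=16)
print(tab_hash(0xDEADBEEF, tables))
```

Each lookup costs one table access and one XOR per character, which is why the family is considered simple and fast. Making the function perfect for a given key set would require choosing the table contents deliberately rather than at random.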
2021-03-29
Mar, Z., Oo, K. K..  2020.  An Improvement of Apriori Mining Algorithm using Linked List Based Hash Table. 2020 International Conference on Advanced Information Technologies (ICAIT). :165–169.
Today, organizations around the world use huge amounts of data. This data must be processed to extract useful information. Consequently, a number of enterprises have discovered valuable information from shopper purchases made at all times. In data mining, one of the most important algorithms for finding frequent item sets in a large database is the Apriori algorithm, which discovers knowledge using association rules. The Apriori algorithm wastes time scanning the whole database to search for frequent item sets, and its memory use is inefficient when a large number of transactions is considered. An existing improved Apriori algorithm adds and calculates a third threshold, which may increase the overhead. The proposed research therefore uses an improved Apriori algorithm with a linked list and a hash table to mine frequent item sets from a large transaction database. The method scans the database with the improved Apriori algorithm and counts the frequent 1-item sets using the hash table. The next frequent item sets are then saved in the linked list while the database is scanned again. The hash table is used to produce the frequent 2-item sets. The database is therefore scanned only twice, which requires less processing time and memory space.
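The two-scan counting described in this abstract can be sketched roughly as below. This is a generic hash-table-based Apriori pass for 1-item and 2-item sets, not the authors' linked-list variant. The function name `frequent_pairs`, the sample transactions, and the support threshold are assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def frequent_pairs(transactions, min_support):
    """Two database scans: count single items, then pairs of frequent items."""
    # Scan 1: hash table (Counter) holding the count of every 1-item set.
    item_counts = Counter(item for t in transactions for item in set(t))
    frequent_items = {i for i, c in item_counts.items() if c >= min_support}

    # Scan 2: hash table of candidate 2-item sets, restricted to frequent
    # items (the Apriori pruning step).
    pair_counts = Counter()
    for t in transactions:
        kept = sorted(set(t) & frequent_items)
        pair_counts.update(combinations(kept, 2))

    frequent_2 = {p: c for p, c in pair_counts.items() if c >= min_support}
    return frequent_items, frequent_2


# Hypothetical market-basket data.
transactions = [
    ["bread", "milk"],
    ["bread", "diapers", "beer"],
    ["milk", "diapers", "beer"],
    ["bread", "milk", "diapers"],
]
items, pairs = frequent_pairs(transactions, min_support=2)
print(items)
print(pairs)
```

Sorting the kept items before forming pairs gives every 2-item set a single canonical key, so the same pair is never counted under two different orderings.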
2021-01-11
Awad, M. A., Ashkiani, S., Porumbescu, S. D., Owens, J. D..  2020.  Dynamic Graphs on the GPU. 2020 IEEE International Parallel and Distributed Processing Symposium (IPDPS). :739–748.
We present a fast dynamic graph data structure for the GPU. Our dynamic graph structure uses one hash table per vertex to store adjacency lists and achieves 3.4-14.8x faster insertion rates over the state of the art across a diverse set of large datasets, as well as deletion speedups up to 7.8x. The data structure supports queries and dynamic updates through both edge and vertex insertion and deletion. In addition, we define a comprehensive evaluation strategy based on operations, workloads, and applications that we believe better characterize and evaluate dynamic graph data structures.
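The "one hash table per vertex" layout can be illustrated with a small CPU-side sketch. This assumes a directed graph and uses Python sets as the per-vertex hash tables. It does not reproduce the authors' GPU data structure or its performance characteristics, and the class and method names are hypothetical.

```python
class DynamicGraph:
    """Directed graph storing each vertex's adjacency list in its own hash set."""

    def __init__(self):
        self.adj = {}  # vertex -> set of neighbor vertices

    def insert_vertex(self, v):
        self.adj.setdefault(v, set())

    def insert_edge(self, u, v):
        self.insert_vertex(u)
        self.insert_vertex(v)
        self.adj[u].add(v)

    def delete_edge(self, u, v):
        # Removing a missing edge is a no-op.
        self.adj.get(u, set()).discard(v)

    def delete_vertex(self, v):
        # Drop the vertex's own table, then every edge pointing at it.
        self.adj.pop(v, None)
        for neighbors in self.adj.values():
            neighbors.discard(v)

    def has_edge(self, u, v):
        return v in self.adj.get(u, ())


g = DynamicGraph()
g.insert_edge(1, 2)
g.insert_edge(1, 3)
g.insert_edge(2, 3)
g.delete_edge(1, 2)
g.delete_vertex(3)
print(g.adj)
```

Edge insertion, deletion, and membership queries each touch only the source vertex's table. Vertex deletion in this sketch scans every table, which is one of the costs a production structure would need to address.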
2020-07-24
Wu, Zhijun, Xu, Enzhong, Liu, Liang, Yue, Meng.  2019.  CHTDS: A CP-ABE Access Control Scheme Based on Hash Table and Data Segmentation in NDN. 2019 18th IEEE International Conference On Trust, Security And Privacy In Computing And Communications/13th IEEE International Conference On Big Data Science And Engineering (TrustCom/BigDataSE). :843–848.

For the future Internet, information-centric networking (ICN) is considered a potential solution to many of its current problems, such as content distribution, mobility, and security. Named Data Networking (NDN) is one of the more popular ICN projects. However, concern regarding the protection of user data persists. Information caching in NDN decouples content from content publishers, which leads to content security threats due to the lack of secure controls. Therefore, this paper presents a CP-ABE (ciphertext-policy attribute-based encryption) access control scheme based on hash table and data segmentation (CHTDS). Based on data segmentation, CHTDS uses a method of linearly splitting fixed data blocks, which effectively improves data management. CHTDS also introduces the CP-ABE mechanism and a hash table data structure to ensure secure access control, so that privilege revocation does not require re-encrypting the published content. The analysis results show that CHTDS can effectively realize secure, fine-grained access control in the NDN environment and reduce communication overhead for content access.

2020-06-12
Domniţa, Dan, Oprişa, Ciprian.  2018.  A genetic algorithm for obtaining memory constrained near-perfect hashing. 2018 IEEE International Conference on Automation, Quality and Testing, Robotics (AQTR). :1–6.

The problem of fast item retrieval from a fixed collection is often encountered in most computer science areas, from operating system components to databases and user interfaces. We present an approach based on hash tables that focuses on both minimizing the number of comparisons performed during the search and minimizing the total collection size. The standard open-addressing double-hashing approach is improved with a non-linear transformation that can be parametrized in order to ensure a uniform distribution of the data in the hash table. The optimal parameter is determined using a genetic algorithm. The paper's results show that near-perfect hashing is faster than binary search, yet uses less memory than perfect hashing, making it a good choice for memory-constrained applications where search time is also critical.
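The standard open-addressing double-hashing baseline mentioned in this abstract can be sketched as follows. This illustration covers only the baseline: the paper's parametrized non-linear transformation and its genetic-algorithm tuning are not reproduced. The class name, the table size of 11 (assumed prime), and the choice of second hash are assumptions, and `None` is reserved as the empty-slot marker.

```python
class DoubleHashTable:
    """Open-addressing set for a fixed collection, using double hashing."""

    def __init__(self, size=11):
        self.size = size              # assumed prime, so any step visits all slots
        self.slots = [None] * size    # None marks an empty slot

    def _probe(self, key):
        h1 = hash(key) % self.size
        h2 = 1 + (hash(key) % (self.size - 1))   # step in 1..size-1, never 0
        for i in range(self.size):
            yield (h1 + i * h2) % self.size

    def insert(self, key):
        for idx in self._probe(key):
            if self.slots[idx] is None or self.slots[idx] == key:
                self.slots[idx] = key
                return idx
        raise RuntimeError("table full")

    def lookup(self, key):
        """Return (found, number of key comparisons performed)."""
        comparisons = 0
        for idx in self._probe(key):
            if self.slots[idx] is None:
                return False, comparisons
            comparisons += 1
            if self.slots[idx] == key:
                return True, comparisons
        return False, comparisons


table = DoubleHashTable()
positions = [table.insert(k) for k in (0, 11, 22)]   # all three collide at slot 0
print(positions, table.lookup(22), table.lookup(33))
```

The comparison counter returned by `lookup` corresponds to the quantity the abstract aims to minimize. A near-perfect scheme would try to keep it close to one for every stored key without growing the table.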

2020-06-08
Fang, Bo, Hua, Zhongyun, Huang, Hejiao.  2019.  Locality-Sensitive Hashing Scheme Based on Heap Sort of Hash Bucket. 2019 14th International Conference on Computer Science Education (ICCSE). :5–10.
Nearest neighbor search (NNS) is a popular research direction that is widely used in machine learning, pattern recognition, image detection, and so on. For low-dimensional data, tree-based search methods give good results, but as the data dimension grows they suffer from the curse of dimensionality. The Locality-Sensitive Hashing (LSH) algorithm greatly improves the efficiency of nearest neighbor queries on high-dimensional data. However, the algorithm relies on building a large number of hash tables, which makes its space complexity very high. C2LSH, based on dynamic collision counting, addresses this disadvantage of LSH, but it must count collisions for a large number of data points, which increases query time. Later researchers have put forward many improved algorithms based on LSH, but these are still not ideal. In this paper, we put forward the Locality-Sensitive Hashing Scheme Based on Heap Sort of Hash Bucket (HSLSH) algorithm, which targets the shortcomings of LSH and C2LSH. Its main idea is to take advantage of the efficiency of heapsort on massive data to improve the efficiency of nearest neighbor queries. It relies on only a small number of hash functions, which both overcomes LSH's need to build a large number of hash tables and avoids the defects of C2LSH. Experiments show that our algorithm is more than 20% better than C2LSH in query accuracy and 40% lower in query time.
Huang, Jiamin, Lu, Yueming, Guo, Kun.  2019.  A Hybrid Packet Classification Algorithm Based on Hash Table and Geometric Space Partition. 2019 IEEE Fourth International Conference on Data Science in Cyberspace (DSC). :587–592.
The emergence of the integrated space-ground network (ISGN), with more complex network conditions than traditional networks, requires packet classification to achieve high performance. Packet classification plays an important role in the field of network security. Although several classification schemes have been proposed recently to improve classification performance, these schemes are unable to meet the high-speed packet classification requirement in ISGN. To tackle this problem, a hybrid packet classification algorithm based on hash table and geometric space partition (HGSP) is proposed in this paper. HGSP is divided into two parts: geometric space partition and hash matching. To improve the classification speed at the same accuracy, a parallel hash table structure is designed to match the huge number of packets to be classified. The experimental results demonstrate that the matching time of the HGSP algorithm is reduced by 40%-70% compared with the traditional HiCuts algorithm. In particular, as the ruleset grows, the advantage of the HGSP algorithm becomes more obvious.
2015-05-06
Pi-Chung Wang.  2014.  Scalable Packet Classification for Datacenter Networks. Selected Areas in Communications, IEEE Journal on. 32:124-137.

The key challenge to a datacenter network is its scalability to handle many customers and their applications. In a datacenter network, packet classification plays an important role in supporting various network services. Previous algorithms store classification rules with the same length combinations in a hash table to simplify the search procedure. The search performance of hash-based algorithms is tied to the number of hash tables. To achieve fast and scalable packet classification, we propose an algorithm, encoded rule expansion, to transform rules into an equivalent set of rules with fewer distinct length combinations, without affecting the classification results. The new algorithm can minimize the storage penalty of transformation and achieve a short search time. In addition, the scheme supports fast incremental updates. Our simulation results show that more than 90% of hash tables can be eliminated. The reduction of length combinations improves the speed of packet classification by an order of magnitude. The results also show that the software implementation of our scheme, without using any hardware parallelism, can support up to one thousand customer VLANs and one million rules, where each rule consumes less than 60 bytes and each packet classification can be accomplished in under 50 memory accesses.
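The baseline this abstract starts from, storing rules with the same prefix-length combination in one hash table, can be sketched as below. The proposed encoded rule expansion is not implemented. The two-field rules, the integer IPv4 addresses, the "higher priority wins" convention, and the names `TupleSpaceClassifier` and `ip` are assumptions made for illustration.

```python
from collections import defaultdict


def ip(a, b, c, d):
    """Pack a dotted-quad IPv4 address into a 32-bit integer (hypothetical helper)."""
    return (a << 24) | (b << 16) | (c << 8) | d


class TupleSpaceClassifier:
    def __init__(self):
        # (src_len, dst_len) -> {(src_prefix, dst_prefix): (priority, action)}
        self.tables = defaultdict(dict)

    @staticmethod
    def _prefix(addr, length):
        # Keep only the top `length` bits of a 32-bit address.
        return addr >> (32 - length) if length else 0

    def add_rule(self, src, src_len, dst, dst_len, priority, action):
        key = (self._prefix(src, src_len), self._prefix(dst, dst_len))
        self.tables[(src_len, dst_len)][key] = (priority, action)

    def classify(self, src, dst):
        best = None
        # One hash probe per distinct length combination.
        for (src_len, dst_len), table in self.tables.items():
            key = (self._prefix(src, src_len), self._prefix(dst, dst_len))
            hit = table.get(key)
            if hit is not None and (best is None or hit[0] > best[0]):
                best = hit
        return best[1] if best else None


clf = TupleSpaceClassifier()
clf.add_rule(ip(10, 0, 0, 0), 8, 0, 0, priority=1, action="allow")
clf.add_rule(ip(10, 1, 0, 0), 16, ip(192, 168, 0, 0), 16, priority=2, action="deny")
print(clf.classify(ip(10, 1, 2, 3), ip(192, 168, 1, 1)))
print(len(clf.tables))
```

Because `classify` probes every table once, lookup cost grows with the number of distinct length combinations, which is the dependency the abstract describes and the reason reducing the number of tables speeds up classification.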
 

Yang Xu, Zhaobo Liu, Zhuoyuan Zhang, Chao, H.J..  2014.  High-Throughput and Memory-Efficient Multimatch Packet Classification Based on Distributed and Pipelined Hash Tables. Networking, IEEE/ACM Transactions on. 22:982-995.

The emergence of new network applications, such as the network intrusion detection system and packet-level accounting, requires packet classification to report all matched rules instead of only the best matched rule. Although several schemes have been proposed recently to address the multimatch packet classification problem, most of them require either huge memory or expensive ternary content addressable memory (TCAM) to store the intermediate data structure, or they suffer from steep performance degradation under certain types of classifiers. In this paper, we decompose the operation of multimatch packet classification from the complicated multidimensional search to several single-dimensional searches, and present an asynchronous pipeline architecture based on a signature tree structure to combine the intermediate results returned from single-dimensional searches. By spreading edges of the signature tree across multiple hash tables at different stages, the pipeline can achieve a high throughput via the interstage parallel access to hash tables. To exploit further intrastage parallelism, two edge-grouping algorithms are designed to evenly divide the edges associated with each stage into multiple work-conserving hash tables. To avoid collisions involved in hash table lookup, a hybrid perfect hash table construction scheme is proposed. Extensive simulation using realistic classifiers and traffic traces shows that the proposed pipeline architecture outperforms HyperCuts and B2PC schemes in classification speed by at least one order of magnitude, while having a similar storage requirement. Particularly, with different types of classifiers of 4K rules, the proposed pipeline architecture is able to achieve a throughput between 26.8 and 93.1 Gb/s using perfect hash tables.

2015-05-05
Koyanagi, T., Shinjo, Y..  2014.  A fast and compact hybrid memory resident datastore for text analytics with autonomic memory allocation. Information and Communication Systems (ICICS), 2014 5th International Conference on. :1-7.

This paper describes a high-performance and space-efficient memory-resident datastore for text analytics systems based on a hash table for fast access, a dynamic trie for staging, and a list of Level-Order Unary Degree Sequence (LOUDS) tries for compactness. We achieve efficient memory allocation and data placement by placing frequently accessed keys in the hash table and infrequently accessed keys in the LOUDS tries, without using conventional cache algorithms. Our algorithm also dynamically changes memory allocation sizes for these data structures according to the remaining available memory size. This technique yields 38.6% to 52.9% better throughput than a double-array trie, a conventional fast and compact datastore.