Biblio

Filters: Keyword is tree data structures
2022-07-12
Khanzadi, Pouria, Kordnoori, Shirin, Vasigh, Zahra, Mostafaei, Hamidreza, Akhtarkavan, Ehsan.  2021.  A Cyber Physical System based Stochastic Process Language With NuSMV Model Checker. 2021 International Conference on Intelligent Technology, System and Service for Internet of Everything (ITSS-IoE). :1–8.
Nowadays, cyber physical systems play an important role in human life, providing features that make interaction between human and machine easier. The main problem in designing and analyzing such systems is their complexity. In this paper, we propose a description language for cyber physical systems based on stochastic processes, called SPDL (Stochastic Process Description Language). To design SPDL, a cyber physical system (CPS) is considered in two main parts: the embedded system and its physical environment. These parts are defined as stochastic processes, and the CPS is defined as a tuple. The syntax and semantics of SPDL are stated based on this definition, with the semantics given in terms of set theory. To implement SPDL, the dependencies between the words of a requirement are extracted as a tree data structure, and SPDL describes the CPS based on these dependencies. A lexical analyzer and a parser based on a BNF grammar defined for SPDL are also designed and implemented. Finally, the SPDL description of a CPS is transformed into NuSMV, a symbolic model checker. Experimental results show that SPDL is capable of describing cyber physical systems stated in natural language.
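To make the pipeline concrete, here is a minimal sketch of the final step under stated assumptions: a requirement's word-dependency tree is serialized into a skeletal NuSMV module. The Node class, the toy requirement, and the state-variable encoding are illustrative assumptions, not the paper's implementation.

```python
class Node:
    """One word of a requirement; children are its dependent words."""
    def __init__(self, word, children=None):
        self.word = word
        self.children = children or []

    def states(self):
        # Collect every word in the dependency tree as a candidate state.
        yield self.word
        for child in self.children:
            yield from child.states()

def to_nusmv(root):
    """Emit a skeletal NuSMV module whose state variable ranges over the
    words of the dependency tree; transition logic is left abstract."""
    states = list(root.states())
    return "\n".join([
        "MODULE main",
        "VAR",
        "  state : {%s};" % ", ".join(states),
        "ASSIGN",
        "  init(state) := %s;" % states[0],
    ])

# Dependency tree for a toy requirement "controller reads sensor".
print(to_nusmv(Node("reads", [Node("controller"), Node("sensor")])))
```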
2020-12-11
Zhang, W., Byna, S., Niu, C., Chen, Y..  2019.  Exploring Metadata Search Essentials for Scientific Data Management. 2019 IEEE 26th International Conference on High Performance Computing, Data, and Analytics (HiPC). :83–92.

Scientific experiments and observations store massive amounts of data in various scientific file formats. Metadata, which describes the characteristics of the data, is commonly used to sift through massive datasets to locate data of interest to scientists. Several indexing data structures (such as hash tables, tries, self-balancing search trees, and sparse arrays) have been developed as part of efforts to provide efficient methods for locating target data. However, how to determine an efficient indexing data structure remains unclear in the context of scientific data management, due to the lack of investigation into metadata, metadata queries, and the corresponding data structures. In this study, we perform a systematic study of metadata search essentials in the context of scientific data management. We study a real-world astronomy observation dataset and explore the characteristics of its metadata. We also study possible metadata queries based on the discovered metadata characteristics and evaluate different data structures for various types of metadata attributes. Our evaluation on a real-world dataset suggests that a trie is a suitable data structure when prefix/suffix queries are required; otherwise, a hash table should be used. We conclude our study with a summary of our findings, which provide a guideline and offer insights for developing metadata indexing methodologies for scientific applications.
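The trie-versus-hash-table finding lends itself to a short illustration. The sketch below (attribute names and file paths are made up) shows why: a trie answers prefix queries directly, while a Python dict, standing in for a hash table, supports only exact matches.

```python
class Trie:
    def __init__(self):
        self.root = {}

    def insert(self, key, value):
        node = self.root
        for ch in key:
            node = node.setdefault(ch, {})
        node["$"] = value   # "$" marks the end of a key

    def prefix_search(self, prefix):
        node = self.root
        for ch in prefix:
            if ch not in node:
                return []
            node = node[ch]
        out = []
        def collect(n):
            for k, v in n.items():
                if k == "$":
                    out.append(v)
                else:
                    collect(v)
        collect(node)
        return out

index = Trie()
index.insert("ngc4258", "/data/obs001.h5")
index.insert("ngc4993", "/data/obs002.h5")
print(index.prefix_search("ngc4"))       # both observations

exact = {"ngc4258": "/data/obs001.h5"}   # hash table: exact match only
print(exact.get("ngc4258"))
```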

2020-08-14
Zolfaghari, Majid, Salimi, Solmaz, Kharrazi, Mehdi.  2019.  Inferring API Correct Usage Rules: A Tree-based Approach. 2019 16th International ISC (Iranian Society of Cryptology) Conference on Information Security and Cryptology (ISCISC). :78–84.
The lack of knowledge about correct API usage rules is one of the main reasons that APIs are employed incorrectly by programmers, which in some cases leads to serious security vulnerabilities. However, finding a correct usage rule for an API is a time-consuming and error-prone task, particularly in the absence of API documentation. Existing approaches to extracting correct usage rules are mostly based on majority API usages, assuming the correct usage is prevalent. Although statistically extracting correct API usage rules achieves reasonable accuracy, it cannot work correctly in the absence of a fair number of sample usages. We propose inferring correct API usage rules independently of the number of sample usages by leveraging an API tree structure. In an API tree, each node is an API, and each node's children are the APIs called by the parent API. Starting from lower-level APIs, it is possible to infer their correct usage rules by utilizing the available correct usage rules of their children. We developed a tool based on this idea for inferring correct API usage rules hierarchically, applied it to the source code of the Linux kernel v4.3 drivers, and found 24 previously reported bugs.
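A minimal sketch of the hierarchical inference, under assumed data: the call graph and the leaf-level rules below are hypothetical, and the real tool works on Linux kernel source rather than hand-written dictionaries.

```python
API_CALLS = {                      # hypothetical call graph
    "dev_init": ["kmalloc", "request_irq"],
    "kmalloc": [],
    "request_irq": [],
}
KNOWN_RULES = {                    # hypothetical leaf-level usage rules
    "kmalloc": {"check return against NULL"},
    "request_irq": {"pair with free_irq"},
}

def infer_rules(api, calls=API_CALLS, known=KNOWN_RULES, memo=None):
    """Infer an API's usage rules as the union of its own known rules
    and the rules inferred for every API it calls."""
    memo = {} if memo is None else memo
    if api in memo:
        return memo[api]
    rules = set(known.get(api, set()))
    for callee in calls.get(api, []):
        rules |= infer_rules(callee, calls, known, memo)
    memo[api] = rules
    return rules

print(infer_rules("dev_init"))     # both leaf rules propagate upward
```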
2020-08-13
Cheng, Chen, Xiaoli, Liu, Linfeng, Wei, Longxin, Lin, Xiaofeng, Wu.  2019.  Algorithm for k-anonymity based on ball-tree and projection area density partition. 2019 14th International Conference on Computer Science Education (ICCSE). :972–975.

K-anonymity is a popular model used in microdata publishing to protect individual privacy. This paper introduces the ideas of the ball-tree and projection area density partition into a k-anonymity algorithm. The traditional kd-tree implements its division by forming hyper-rectangles, but a hyper-rectangle has corner regions, so it cannot guarantee that records at the corners are most similar to the other records in their area. In this paper, the hyper-spheres formed by a ball-tree are used to address this problem. We adopt projection area density partition to increase the density of the resulting record points. We evaluate our algorithm on the GoTrack and Adult datasets from the UCI repository. The experiments show that the k-anonymity algorithm based on the ball-tree and projection area density partition obtains more anonymous groups and a lower generalization rate; the smaller K is, the more obvious the advantage. The results indicate that our algorithm achieves higher data usability.
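The ball-tree ingredient alone can be sketched briefly. The fragment below uses scikit-learn's BallTree to group each record with its k-1 nearest neighbors, a plausible first step toward k-anonymous groups; the projection area density partition and the paper's generalization policy are not reproduced.

```python
import numpy as np
from sklearn.neighbors import BallTree

rng = np.random.default_rng(0)
records = rng.random((100, 2))   # toy quasi-identifier attributes
k = 5

tree = BallTree(records)
dist, idx = tree.query(records[:1], k=k)   # one record plus its 4 neighbors
group = records[idx[0]]
# Generalize the group to attribute ranges, one simple anonymization step.
print(group.min(axis=0), group.max(axis=0))
```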

2020-06-08
Huang, Jiamin, Lu, Yueming, Guo, Kun.  2019.  A Hybrid Packet Classification Algorithm Based on Hash Table and Geometric Space Partition. 2019 IEEE Fourth International Conference on Data Science in Cyberspace (DSC). :587–592.
The emergence of the integrated space-ground network (ISGN), with more complex network conditions than traditional networks, requires packet classification to achieve high performance. Packet classification plays an important role in the field of network security. Although several classification schemes have been proposed recently to improve classification performance, they are unable to meet the high-speed packet classification requirements of the ISGN. To tackle this problem, a hybrid packet classification algorithm based on hash tables and geometric space partition (HGSP) is proposed in this paper. HGSP falls into two sections: geometric space partition and hash matching. To improve classification speed at the same accuracy, a parallel hash table structure is designed to match the large volume of packets to be classified. The experimental results demonstrate that the matching time of the HGSP algorithm is 40%-70% lower than that of the traditional HiCuts algorithm. In particular, as the ruleset grows, the advantage of HGSP becomes more obvious.
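A minimal sketch of the hybrid idea, with made-up rules and a made-up partition criterion: packets are first routed to a partition of the geometric header space, then matched by an O(1) hash lookup inside that partition. This illustrates the two-stage structure, not the HGSP implementation.

```python
RULES = {
    # partition key          hash table: header tuple -> rule id
    ("tcp", "low_ports"):  {("10.0.0.1", 80): "R1", ("10.0.0.2", 22): "R2"},
    ("udp", "high_ports"): {("10.0.0.3", 5353): "R3"},
}

def classify(proto, dst_ip, dst_port):
    # Geometric step: pick the partition covering this packet's header space.
    region = "low_ports" if dst_port < 1024 else "high_ports"
    table = RULES.get((proto, region), {})
    # Hash step: O(1) exact lookup within the partition.
    return table.get((dst_ip, dst_port), "default")

print(classify("tcp", "10.0.0.1", 80))   # R1
print(classify("udp", "10.0.0.9", 53))   # default
```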
2020-05-22
Wang, Xi, Yao, Jun, Ji, Hongxia, Zhang, Ze, Li, Chen, Ma, Beizhi.  2018.  A Local Integral Hash Nearest Neighbor Algorithm. 2018 3rd International Conference on Mechanical, Control and Computer Engineering (ICMCCE). :544–548.

The nearest neighbor search algorithm plays a very important role in computer image algorithms, and fast search algorithms are needed when the search data are large. Current fast retrieval algorithms are tree based, but the efficiency of tree algorithms decreases sharply as the dimensionality of the data increases. In this paper, a local integral hash nearest neighbor algorithm is proposed that constructs the tree structure by changing how tree nodes are accessed, so that it can express the characteristics of the data distribution. Experimental testing shows that the algorithm achieves more efficient performance on high-dimensional data.
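Since the paper's exact construction is not spelled out in the abstract, the sketch below illustrates the general hashing-for-nearest-neighbor idea with a generic random-projection (LSH-style) stand-in, not the paper's local integral hash: points are bucketed by hash so candidates are found without traversing a tree.

```python
import numpy as np

rng = np.random.default_rng(1)
dim, n_planes = 32, 8
planes = rng.normal(size=(n_planes, dim))   # random hyperplanes

def bucket(v):
    # The sign pattern of the projections is the hash key.
    return tuple((planes @ v > 0).astype(int))

data = rng.normal(size=(1000, dim))
table = {}
for i, v in enumerate(data):
    table.setdefault(bucket(v), []).append(i)

query = data[0] + 0.01 * rng.normal(size=dim)
# Fall back to a full scan if the query's bucket happens to be empty.
candidates = table.get(bucket(query), []) or range(len(data))
best = min(candidates, key=lambda i: np.linalg.norm(data[i] - query))
print(best)   # almost certainly 0: the perturbed copy of data[0]
```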

2020-04-03
Luo, Xueting, Lu, Yueming.  2019.  A Method of Conflict Detection for Security Policy Based on B+ Tree. 2019 IEEE Fourth International Conference on Data Science in Cyberspace (DSC). :466–472.

Security policy is widely used in network management systems to ensure network security, and it is necessary to detect and resolve conflicts in security policies. This paper analyzes the shortcomings of existing security policy conflict detection methods and proposes a B+ tree-based method. First, the security policy is dimensioned so that each attribute corresponds to one dimension. Then, a layer of B+ tree index is constructed at each dimension level, so that every rule is uniquely mapped by multiple layers of nested indexes. This method greatly improves the efficiency of conflict detection. The experimental results show that the method has very stable performance, effectively prevents conflicts, and detects the type of a policy conflict quickly and accurately.
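A minimal sketch of the conflict check itself, with made-up rules: plain interval-overlap tests stand in for the paper's nested B+ tree indexes, which would accelerate the candidate search rather than change the definition of a conflict.

```python
RULES = [
    # (src_lo, src_hi, port_lo, port_hi, action)
    (10, 20, 80, 80, "allow"),
    (15, 25, 80, 80, "deny"),    # overlaps rule 0 with the opposite action
    (30, 40, 22, 22, "allow"),
]

def overlaps(lo1, hi1, lo2, hi2):
    return lo1 <= hi2 and lo2 <= hi1

def detect_conflicts(rules):
    conflicts = []
    for i, a in enumerate(rules):
        for b in rules[i + 1:]:
            # A conflict: the rules overlap in every dimension but
            # prescribe different actions.
            if (overlaps(a[0], a[1], b[0], b[1])
                    and overlaps(a[2], a[3], b[2], b[3])
                    and a[4] != b[4]):
                conflicts.append((a, b))
    return conflicts

print(detect_conflicts(RULES))   # reports the allow/deny pair on port 80
```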

2020-02-24
Jiang, Jehn-Ruey, Chung, Wei-Sheng.  2019.  Real-Time Proof of Violation with Adaptive Huffman Coding Hash Tree for Cloud Storage Service. 2019 IEEE 12th Conference on Service-Oriented Computing and Applications (SOCA). :147–153.
This paper proposes two adaptive Huffman coding hash tree algorithms to construct the hash tree of a file system. The algorithms are used to design the real-time proof of violation (PoV) scheme for the cloud storage service to achieve mutual non-repudiation between the user and the service provider. The PoV scheme can then generate cryptographic proofs once the service-level agreement (SLA) is violated. Based on adaptive Huffman coding, the proposed algorithms add hash tree nodes dynamically when a file is accessed for the first time. Every node keeps a count to reflect the frequency of occurrence of the associated file, and all nodes' counts and the tree structure are adjusted on-the-fly for every file access. This can significantly reduce the memory and computation overheads required by the PoV scheme. The file access patterns of the NCUCCWiki and the SNIA IOTTA datasets are used to evaluate the performance of the proposed algorithms. The algorithms are also compared with a related hash tree construction algorithm used in a PoV scheme, named ERA, to show their superiority in performance.
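The counting idea can be sketched compactly. In the fragment below (illustrative only; the paper's trees are shaped by the Huffman counts, whereas this stand-in folds leaves pairwise), each file node keeps an access count and the root hash changes with every access, which is what makes unattested accesses evident.

```python
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

class FileNode:
    def __init__(self, name):
        self.name, self.count = name, 0

    def digest(self):
        return h(f"{self.name}:{self.count}".encode())

def root_hash(nodes):
    """Fold leaf digests pairwise into a single root (a flat stand-in
    for the frequency-shaped tree the paper builds)."""
    level = [n.digest() for n in nodes]
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

files = [FileNode("a.txt"), FileNode("b.txt")]
files[0].count += 1            # an access updates the count on-the-fly
print(root_hash(files).hex())  # the root changes, evidencing the access
```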
2020-02-18
Zhang, Detian, Liu, An, Jin, Gaoming, Li, Qing.  2019.  Edge-Based Shortest Path Caching for Location-Based Services. 2019 IEEE International Conference on Web Services (ICWS). :320–327.

Shortest path queries on road networks are widely used in location-based services (LBS), e.g., finding the shortest route from home to the airport through Google Maps. However, when a large number of path queries arrive concurrently or within a short period, an LBS provider (e.g., Google Maps) has to endure a high workload, which may lead to long response times for users. Therefore, path caching services are used to accelerate large-scale path query processing: they store historical path results and reuse them to answer incoming queries directly. However, most existing path caches are organized based on the nodes of paths; hence, the underlying road network topology is still needed to answer a path query whose origin or destination lies on an edge. To overcome this limitation, we propose an edge-based shortest path cache that can efficiently handle queries without needing any road information, which is much more practical in the real world. We achieve this by designing a completely new edge-based path cache structure, an efficient R-tree-based cache lookup algorithm, and a greedy cache construction algorithm. Extensive experiments on a real road network and real point-of-interest datasets show the efficiency, scalability, and applicability of our proposed caching techniques.
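A minimal sketch of the edge-based lookup, assuming a single cached path and made-up edge ids: because any sub-path of a shortest path is itself shortest, a query whose origin and destination edges both lie on a cached path, in order, can be answered without the road network. The paper's R-tree lookup and greedy construction are omitted.

```python
# A cached shortest path as a sequence of edge ids.
cached_path = ["e1", "e2", "e3", "e4"]
edge_pos = {e: i for i, e in enumerate(cached_path)}

def lookup(origin_edge, dest_edge):
    """Answer from cache if both endpoint edges lie on the cached path,
    in order; otherwise report a miss."""
    i, j = edge_pos.get(origin_edge), edge_pos.get(dest_edge)
    if i is None or j is None or i > j:
        return None                    # cache miss
    return cached_path[i:j + 1]        # a sub-path of a shortest path is shortest

print(lookup("e2", "e4"))   # ['e2', 'e3', 'e4']
print(lookup("e4", "e1"))   # None: wrong direction for this cached path
```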

2019-11-25
Wu, Songrui, Li, Qi, Li, Guoliang, Yuan, Dong, Yuan, Xingliang, Wang, Cong.  2019.  ServeDB: Secure, Verifiable, and Efficient Range Queries on Outsourced Database. 2019 IEEE 35th International Conference on Data Engineering (ICDE). :626–637.

Outsourcing data to the cloud has become common IT practice due to its significant benefits. Meanwhile, security and privacy concerns are critical obstacles hindering further adoption of the cloud. Although data encryption can mitigate the problem, it reduces the functionality of query processing, e.g., disabling SQL queries. Several schemes have been proposed to enable one-dimensional queries on encrypted data, but multi-dimensional range queries have not been well addressed. In this paper, we propose a secure and scalable scheme that can support multi-dimensional range queries over encrypted data. The proposed scheme has three salient features: (1) Privacy: the server cannot learn the contents of queries and data records during query processing. (2) Efficiency: we utilize hierarchical cubes to encode multi-dimensional data records and construct a secure tree index on top of this encoding to achieve sublinear query time. (3) Verifiability: our scheme allows users to verify the correctness and completeness of the query results to address the server's malicious behaviors. We perform formal security analysis and comprehensive experimental evaluations. The results on real datasets demonstrate that our scheme achieves practical performance while guaranteeing data privacy and result integrity.
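The hierarchical-cube encoding can be illustrated under an assumed Z-order-style reading (ServeDB's actual encoding and its cryptographic protection are not reproduced here): each level halves every dimension, and a record is encoded by its cube id per level, so a range query can be matched by comparing codes level by level.

```python
def cube_codes(x, y, levels=3, size=8):
    """Return the cube id of point (x, y) at each level of a size x size
    grid; level l splits the space into 2**l x 2**l cubes."""
    codes = []
    for l in range(1, levels + 1):
        cells = 2 ** l
        codes.append((l, x * cells // size, y * cells // size))
    return codes

record = cube_codes(5, 2)            # e.g. one record's encoding
query = cube_codes(5, 2, levels=2)   # a coarser query region
# The server matches on codes without seeing plaintext coordinates
# (in the real scheme the codes themselves would also be protected).
print(record)
print(set(query) <= set(record))     # True: the record falls in the query cubes
```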

2019-03-22
Quweider, M., Lei, H., Zhang, L., Khan, F..  2018.  Managing Big Data in Visual Retrieval Systems for DHS Applications: Combining Fourier Descriptors and Metric Space Indexing. 2018 1st International Conference on Data Intelligence and Security (ICDIS). :188–193.

Image retrieval systems have been an active area of research for more than thirty years, progressively producing algorithms that improve performance metrics, operate in different domains, take advantage of different features extracted from the images to be retrieved, and have different desirable invariance properties. With the ever-growing visual databases of images and videos produced by a myriad of devices comes the challenge of selecting effective features and performing fast retrieval on such databases. In this paper, we incorporate Fourier descriptors (FD) along with a metric-based balanced indexing tree as a viable solution to DHS (Department of Homeland Security) needs for quick identification and retrieval of weapon images. The FDs allow a simple but effective outline feature representation of an object, while the M-tree provides a dynamic, fast, and balanced search over such features. Motivated by applications of interest to DHS, we have created a basic gun and rifle database that can be used to identify weapons in images and videos extracted from media sources. Our simulations show excellent performance in both representation quality and retrieval speed.
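Fourier descriptors are standard enough to sketch. The fragment below computes translation-, scale-, and rotation-invariant descriptors from a toy outline; the contour, sampling density, and truncation length are illustrative choices.

```python
import numpy as np

def fourier_descriptors(contour_xy, n_coeffs=8):
    z = contour_xy[:, 0] + 1j * contour_xy[:, 1]   # outline as a complex sequence
    coeffs = np.fft.fft(z)
    coeffs[0] = 0                  # drop the DC term: translation invariance
    mags = np.abs(coeffs)
    mags /= mags[1]                # divide by the first harmonic: scale invariance
    return mags[1:n_coeffs + 1]    # keep magnitudes only: rotation invariance

# Toy outline: an ellipse sampled at 64 points.
t = np.linspace(0, 2 * np.pi, 64, endpoint=False)
ellipse = np.column_stack([2 * np.cos(t), np.sin(t)])
print(fourier_descriptors(ellipse))
```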

2018-08-23
Ning, F., Wen, Y., Shi, G., Meng, D..  2017.  Efficient tamper-evident logging of distributed systems via concurrent authenticated tree. 2017 IEEE 36th International Performance Computing and Communications Conference (IPCCC). :1–9.
Secure logging, as an indispensable part of any secure system in practice, is well understood by both academia and industry. However, providing security for audit logs on an untrusted machine in a large distributed system is still a challenging task. The emergence and wide availability of log management tools prompted plenty of work in the security community that allows clients or auditors to verify the integrity of log data. Most recent solutions to this problem focus on space efficiency or on the public verifiability of forward security. Unfortunately, existing secure audit logging schemes have significant performance limitations that make them impractical for real-time, large-scale distributed applications: cryptographic hashing is computationally expensive for logging in task-intensive or resource-constrained systems, especially for proving individual log events, while the Merkle-tree approach has fundamental limitations when faced with highly concurrent, large-scale log streams due to its serial append requirement. The verification step of Merkle-tree based approaches, requiring a logarithmic number of hash computations, is becoming a bottleneck for overall performance. There is a huge gap between the flux of collected log streams and the computational efficiency of integrity verification in large-scale distributed systems. In this work, we develop a novel scheme whose performance compares favorably with existing solutions. The performance guarantees we achieve stem from a novel data structure called the concurrent authenticated tree, which allows log events to be appended concurrently and removes the need to wait for append operations to complete sequentially. We implement a prototype using chameleon hashing based on the discrete log assumption and a Merkle history tree. A comprehensive experimental evaluation of the proposed and existing approaches is used to validate the analytical models and verify our claims. The results demonstrate that our scheme, verifying in a concurrent way, is significantly more efficient than the previous tree-based approach.
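As background for the bottleneck the paper targets, here is a minimal Merkle membership proof: verifying one event costs a logarithmic number of hashes, and appends are inherently serial. The proposed concurrent authenticated tree avoids the serial appends via chameleon hashing, which this sketch does not attempt.

```python
import hashlib

def h(b: bytes) -> bytes:
    return hashlib.sha256(b).digest()

def build_levels(leaves):
    levels = [[h(e) for e in leaves]]
    while len(levels[-1]) > 1:
        cur = levels[-1]
        if len(cur) % 2:
            cur = cur + [cur[-1]]
        levels.append([h(cur[i] + cur[i + 1]) for i in range(0, len(cur), 2)])
    return levels

def prove(levels, i):
    """Sibling hashes from leaf i to the root: the logarithmic-size proof
    whose verification cost the paper identifies as a bottleneck."""
    proof = []
    for level in levels[:-1]:
        if len(level) % 2:
            level = level + [level[-1]]
        proof.append((level[i ^ 1], i % 2))   # (sibling, am-I-the-right-child)
        i //= 2
    return proof

def verify(leaf, proof, root):
    acc = h(leaf)
    for sib, is_right in proof:
        acc = h(sib + acc) if is_right else h(acc + sib)
    return acc == root

events = [b"login alice", b"read /etc/shadow", b"logout alice"]
levels = build_levels(events)
root = levels[-1][0]
print(verify(events[1], prove(levels, 1), root))   # True
```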
2018-01-16
Gurjar, S. P. S., Pasupuleti, S. K..  2016.  A privacy-preserving multi-keyword ranked search scheme over encrypted cloud data using MIR-tree. 2016 International Conference on Computing, Analytics and Security Trends (CAST). :533–538.

With the increasing popularity of cloud computing, data owners are motivated to outsource their sensitive data to cloud servers for flexibility and reduced cost in data management. However, privacy is a big concern when outsourcing data to the cloud, so data owners typically encrypt documents before outsourcing. As the volume of data increases at a dramatic rate, it is essential to develop efficient and reliable ciphertext search techniques so that data owners can easily access and update cloud data. In this paper, we propose a privacy-preserving multi-keyword ranked search scheme over encrypted cloud data, along with data integrity, using a new authenticated data structure called the MIR-tree. The MIR-tree based index combines the widely used vector space model with the TF×IDF model in index construction and query generation. We use an inverted file index for storing word digests, which provides efficient and fast relevance scoring between the query and cloud data, and we design an authentication set (AS) for authenticating queries and verifying top-k search results. Because of the tree-based index, our scheme achieves optimal search efficiency and reduces the communication overhead of verifying the search results. The analysis shows the security and efficiency of our scheme.
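The ranking model underneath the scheme is standard TF×IDF over a vector space, which the sketch below illustrates on two toy documents; the encryption, the MIR-tree index, and the authentication set are omitted.

```python
import math
from collections import Counter

docs = {
    "d1": "cloud data privacy privacy",
    "d2": "cloud search ranked search",
}
tf = {d: Counter(text.split()) for d, text in docs.items()}
N = len(docs)

def idf(term):
    df = sum(1 for c in tf.values() if term in c)
    return math.log((N + 1) / (df + 1)) + 1    # smoothed IDF

def score(doc, query_terms):
    c = tf[doc]
    return sum((c[t] / sum(c.values())) * idf(t) for t in query_terms)

query = ["privacy", "cloud"]
ranked = sorted(docs, key=lambda d: score(d, query), reverse=True)
print(ranked)   # the top-k result list, here ['d1', 'd2']
```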

2017-12-12
Legg, P. A., Buckley, O., Goldsmith, M., Creese, S..  2017.  Automated Insider Threat Detection System Using User and Role-Based Profile Assessment. IEEE Systems Journal. 11:503–512.

Organizations are experiencing an ever-growing concern of how to identify and defend against insider threats. Those who have authorized access to sensitive organizational data are placed in a position of power that could well be abused and could cause significant damage to an organization. This could range from financial theft and intellectual property theft to the destruction of property and business reputation. Traditional intrusion detection systems are neither designed nor capable of identifying those who act maliciously within an organization. In this paper, we describe an automated system that is capable of detecting insider threats within an organization. We define a tree-structure profiling approach that incorporates the details of activities conducted by each user and each job role and then use this to obtain a consistent representation of features that provide a rich description of the user's behavior. Deviation can be assessed based on the amount of variance that each user exhibits across multiple attributes, compared against their peers. We have performed experimentation using ten synthetic data-driven scenarios and found that the system can identify anomalous behavior that may be indicative of a potential threat. We also show how our detection system can be combined with visual analytics tools to support further investigation by an analyst.
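One plausible reading of the profile assessment step can be sketched numerically (the features, peer group, and threshold below are all illustrative): each user's feature vector is z-scored against the peer group, and a large aggregate deviation flags a candidate threat.

```python
import numpy as np

# Rows: users in the same job role; columns: e.g. logins, file accesses, emails.
peers = np.array([
    [5, 20, 15],
    [6, 22, 14],
    [5, 19, 16],
    [40, 300, 2],    # an anomalous user
], dtype=float)

mu, sigma = peers.mean(axis=0), peers.std(axis=0)
z = np.abs((peers - mu) / sigma)     # per-attribute deviation from peers
scores = z.mean(axis=1)              # aggregate deviation per user
print(np.where(scores > 1.0)[0])     # flags user index 3
```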

2017-02-23
K. Xiangying, C. Yanhui.  2015.  Dynamic Remote Attestation Based on Concerns. 2015 8th International Symposium on Computational Intelligence and Design (ISCID). 1:76–80.

Based on an analysis of the relationship between the challenger and the attester in the remote attestation process, we propose a dynamic remote attestation model based on concerns. By combining the trusted root with a dynamic trusted monitoring module, the model converts the measurement of every loaded module in the integrity measurement architecture into attestation of the basic computing environment, the dynamic trusted monitoring module, and the requested service software module. We discuss the rationality of the model. The model uses a Merkle hash tree to store application software integrity metrics, which both protects the privacy of the other party's application software and improves the efficiency of remote attestation. An experimental prototype system shows that the model can verify the dynamic behavior of software, making up for the shortcomings of static measurement.
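As a point of reference for the static measurement the model improves on, the sketch below shows TPM-style extension of a measurement register over loaded modules (an assumed illustration, not the paper's model; its Merkle-tree storage of per-application metrics is a separate mechanism).

```python
import hashlib

def extend(register: bytes, measurement: bytes) -> bytes:
    # TPM-style extend: new register = H(old register || H(module)).
    return hashlib.sha256(register + hashlib.sha256(measurement).digest()).digest()

register = bytes(32)    # the initial register value
for module in [b"bootloader", b"kernel", b"service.so"]:
    register = extend(register, module)

print(register.hex())   # the value reported to and checked by the challenger
```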

2015-05-06
Yang Xu, Zhaobo Liu, Zhuoyuan Zhang, Chao, H.J..  2014.  High-Throughput and Memory-Efficient Multimatch Packet Classification Based on Distributed and Pipelined Hash Tables. Networking, IEEE/ACM Transactions on. 22:982–995.

The emergence of new network applications, such as the network intrusion detection system and packet-level accounting, requires packet classification to report all matched rules instead of only the best matched rule. Although several schemes have been proposed recently to address the multimatch packet classification problem, most of them require either huge memory or expensive ternary content addressable memory (TCAM) to store the intermediate data structure, or they suffer from steep performance degradation under certain types of classifiers. In this paper, we decompose the operation of multimatch packet classification from the complicated multidimensional search to several single-dimensional searches, and present an asynchronous pipeline architecture based on a signature tree structure to combine the intermediate results returned from single-dimensional searches. By spreading edges of the signature tree across multiple hash tables at different stages, the pipeline can achieve a high throughput via the interstage parallel access to hash tables. To exploit further intrastage parallelism, two edge-grouping algorithms are designed to evenly divide the edges associated with each stage into multiple work-conserving hash tables. To avoid collisions involved in hash table lookup, a hybrid perfect hash table construction scheme is proposed. Extensive simulation using realistic classifiers and traffic traces shows that the proposed pipeline architecture outperforms HyperCuts and B2PC schemes in classification speed by at least one order of magnitude, while having a similar storage requirement. Particularly, with different types of classifiers of 4K rules, the proposed pipeline architecture is able to achieve a throughput between 26.8 and 93.1 Gb/s using perfect hash tables.
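The decomposition idea can be sketched with made-up rules: each header field is looked up in its own hash table, and the per-field match sets are intersected so that all matching rules are reported, not just the best one. The signature tree, pipeline stages, and perfect hashing of the paper are omitted.

```python
FIELD_TABLES = [
    {"10.0.0.1": {"R1", "R2"}, "*": {"R3"}},   # source IP -> rule ids
    {"80": {"R1", "R3"}, "22": {"R2"}},        # dest port -> rule ids
]

def multimatch(packet_fields):
    matched = None
    for table, value in zip(FIELD_TABLES, packet_fields):
        # Rules matching this field exactly, plus wildcard rules.
        rules = table.get(value, set()) | table.get("*", set())
        matched = rules if matched is None else matched & rules
    return matched

# {'R1', 'R3'} (set order may vary): every rule matching both fields.
print(multimatch(["10.0.0.1", "80"]))
```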

2015-05-05
Koyanagi, T., Shinjo, Y..  2014.  A fast and compact hybrid memory resident datastore for text analytics with autonomic memory allocation. Information and Communication Systems (ICICS), 2014 5th International Conference on. :1–7.

This paper describes a high-performance and space-efficient memory-resident datastore for text analytics systems, based on a hash table for fast access, a dynamic trie for staging, and a list of Level-Order Unary Degree Sequence (LOUDS) tries for compactness. We achieve efficient memory allocation and data placement by placing frequently accessed keys in the hash table and infrequently accessed keys in the LOUDS tries, without using conventional cache algorithms. Our algorithm also dynamically changes the memory allocation sizes for these data structures according to the remaining available memory. This technique yields 38.6% to 52.9% better throughput than a double-array trie, a conventional fast and compact datastore.
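The LOUDS encoding itself is standard and compact enough to show: a breadth-first traversal writes one '1' per child followed by a '0' per node, so an n-node tree's shape costs about 2n bits. The tree below is illustrative.

```python
from collections import deque

tree = {0: [1, 2], 1: [3], 2: [], 3: []}   # node -> children

def louds_bits(tree, root=0):
    bits = "10"                  # super-root convention
    queue = deque([root])
    while queue:
        node = queue.popleft()
        bits += "1" * len(tree[node]) + "0"   # unary degree, then terminator
        queue.extend(tree[node])
    return bits

print(louds_bits(tree))   # '101101000' for the 4-node tree above
```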