Biblio
Scientific experiments and observations store massive amounts of data in various scientific file formats. Metadata, which describes the characteristics of the data, is commonly used to sift through massive datasets in order to locate data of interest to scientists. Several indexing data structures (such as hash tables, tries, self-balancing search trees, and sparse arrays) have been developed as part of efforts to provide an efficient method for locating target data. However, how to choose an efficient indexing data structure remains unclear in the context of scientific data management, owing to the lack of investigation into metadata, metadata queries, and the corresponding data structures. In this study, we systematically examine the essentials of metadata search in the context of scientific data management. We study a real-world astronomy observation dataset and explore the characteristics of the metadata in the dataset. We also study possible metadata queries based on the metadata characteristics we discover, and we evaluate different data structures for various types of metadata attributes. Our evaluation on a real-world dataset suggests that a trie is a suitable data structure when prefix/suffix queries are required; otherwise, a hash table should be used. We conclude our study with a summary of our findings. These findings provide guidelines and offer insights for developing metadata indexing methodologies for scientific applications.
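The trade-off named in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the `Trie` class, the `build_indexes` helper, and the sample file names are hypothetical and chosen only for illustration. A hash table (here a Python `dict`) answers exact-match queries, a trie answers prefix queries, and a second trie built over reversed keys answers suffix queries.

```python
class Trie:
    """Minimal character trie mapping metadata values to record ids."""

    def __init__(self):
        self.children = {}  # next character -> child Trie node
        self.ids = set()    # ids of records whose key ends at this node

    def insert(self, key, record_id):
        node = self
        for ch in key:
            node = node.children.setdefault(ch, Trie())
        node.ids.add(record_id)

    def prefix_search(self, prefix):
        """Return the ids of all records whose key starts with `prefix`."""
        node = self
        for ch in prefix:
            node = node.children.get(ch)
            if node is None:
                return set()
        found, stack = set(), [node]
        while stack:  # walk the whole subtree below the prefix
            current = stack.pop()
            found |= current.ids
            stack.extend(current.children.values())
        return found


def build_indexes(records):
    """Build a hash index, a prefix trie, and a suffix trie (reversed keys)."""
    hash_index, prefix_index, suffix_index = {}, Trie(), Trie()
    for record_id, value in records.items():
        hash_index.setdefault(value, set()).add(record_id)
        prefix_index.insert(value, record_id)
        suffix_index.insert(value[::-1], record_id)
    return hash_index, prefix_index, suffix_index


# Hypothetical metadata values, used only for illustration.
records = {1: "galaxy_ngc1300.fits", 2: "galaxy_m31.fits", 3: "star_vega.h5"}
hash_index, prefix_index, suffix_index = build_indexes(records)

exact = hash_index.get("star_vega.h5", set())             # exact match
by_prefix = prefix_index.prefix_search("galaxy")          # prefix query
by_suffix = suffix_index.prefix_search(".fits"[::-1])     # suffix query
```

The hash index cannot answer the prefix or suffix query without scanning every key, which is the situation in which the abstract recommends a trie.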
In recent years, real-world attacks against PKI have occurred frequently. For example, certificates for malicious domains issued by compromised CAs are widespread, and revoked certificates are still trusted by clients. Despite considerable research on improving the security of SSL/TLS connections, some problems remain unsolved. On one hand, although log-based schemes provide a certificate audit service to quickly detect CA misbehavior, the security and data consistency of the log servers are ignored. On the other hand, checking for revoked certificates is neglected because certificate revocation mechanisms are incomplete, insecure, and inefficient. Further, existing revocation checking schemes are centralized, which creates security bottlenecks. In this paper, we propose a blockchain-based public and efficient audit scheme for TLS connections, called CertChain. Specifically, we propose a dependability-rank-based consensus protocol for our blockchain system and a new data structure to support certificate forward traceability. Furthermore, we present a method that uses a dual counting Bloom filter (DCBF) and eliminates false positives to achieve space-efficient storage and efficient queries for certificate revocation checking. The security analysis and experimental results demonstrate that CertChain is practical, with moderate overhead.
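For readers unfamiliar with the building block behind the DCBF, the following is a minimal sketch of an ordinary counting Bloom filter. It is not the paper's dual counting Bloom filter and does not eliminate false positives; the class name, parameters, and the serial-number string are assumptions made for illustration. Unlike a plain Bloom filter, it keeps counters instead of bits, so entries can be removed again.

```python
import hashlib


class CountingBloomFilter:
    """Plain counting Bloom filter (illustrative, not the paper's DCBF)."""

    def __init__(self, size=1024, num_hashes=3):
        self.size = size
        self.num_hashes = num_hashes
        self.counters = [0] * size

    def _positions(self, item):
        # Derive num_hashes counter positions from salted SHA-256 digests.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, item):
        for pos in self._positions(item):
            self.counters[pos] += 1

    def remove(self, item):
        # Only decrement when the item appears present, so counters stay >= 0.
        if item in self:
            for pos in self._positions(item):
                self.counters[pos] -= 1

    def __contains__(self, item):
        # May report false positives, never false negatives.
        return all(self.counters[pos] > 0 for pos in self._positions(item))


revoked = CountingBloomFilter()
revoked.add("serial:04A1")            # hypothetical certificate serial number
is_revoked = "serial:04A1" in revoked
revoked.remove("serial:04A1")
still_revoked = "serial:04A1" in revoked
```

A membership answer of "present" from such a filter is only probabilistic, which is why a scheme that relies on it for revocation checking needs an additional mechanism to rule out false positives.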
Genetic algorithms are a group of evolution-inspired mathematical models in computational science that attract considerable interest among AI techniques today. These algorithms encode a candidate solution to a specific problem in a chromosome-like data structure and preserve critical information by applying simple recombination operators to it. Genetic algorithms are optimizers, and the range of problems to which they have been applied is quite broad. Their global search rests on basic principles such as selection, crossover, and mutation. Neural networks combine data structures and algorithms inspired by the human brain to classify data and to learn. Artificial intelligence (AI) is a field concerned with the many tasks that humans perform naturally. Performing such tasks on a computer with conventional AI methods has proved complicated. Neural network techniques build an internal structure of rules from which a program can learn by example to classify different inputs, in contrast to mining techniques. This paper proposes a phishing website classifier that uses improved polynomial neural networks within a genetic algorithm.
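The three principles named in the abstract above (selection, crossover, and mutation) can be sketched on a toy problem. This is a generic illustration, not the paper's phishing classifier or its polynomial neural network: the fitness function (count of 1 bits), tournament selection, one-point crossover, elitism, and all parameter values are assumptions chosen for brevity.

```python
import random


def fitness(chromosome):
    # Toy objective: the number of 1 bits in the chromosome.
    return sum(chromosome)


def tournament_select(population, rng, k=3):
    # Selection: pick the fittest of k randomly sampled individuals.
    return max(rng.sample(population, k), key=fitness)


def crossover(parent_a, parent_b, rng):
    # One-point crossover: head of one parent, tail of the other.
    point = rng.randrange(1, len(parent_a))
    return parent_a[:point] + parent_b[point:]


def mutate(chromosome, rng, rate=0.02):
    # Mutation: flip each bit independently with probability `rate`.
    return [1 - gene if rng.random() < rate else gene for gene in chromosome]


def run_ga(length=20, pop_size=30, generations=40, seed=0):
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(length)]
                  for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        best = max(population, key=fitness)
        history.append(fitness(best))
        children = [best]  # elitism: carry the best individual over unchanged
        while len(children) < pop_size:
            parent_a = tournament_select(population, rng)
            parent_b = tournament_select(population, rng)
            children.append(mutate(crossover(parent_a, parent_b, rng), rng))
        population = children
    best = max(population, key=fitness)
    history.append(fitness(best))
    return best, history


best, history = run_ga()
```

Because the best individual of each generation is copied into the next one, the best fitness recorded in `history` never decreases from one generation to the next.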