Biblio
Analysis of the state of development of research on the protection of valuable scientific and educational databases, library resources, information centers, publishers show the importance of information security, especially in corporate information networks and systems for data exchange. Corporate library networks include dozens and even hundreds of libraries for active information exchange, and they (libraries) are equipped with information security tools to varying degrees. The purpose of the research is to create effective methods and tools to protect the databases of the scientific and educational resources from unauthorized access in libraries and library networks using fuzzy logic methods.
With the development of mobile internet technology, GPS technology and social software have been widely used in people's lives. The problem of big data privacy protection related to location trajectory is becoming more and more serious. The traditional location trajectory privacy protection method requires certain background knowledge and it is difficult to adapt to massive mass. Privacy protection of data. differential privacy protection technology protects privacy by attacking data by randomly perturbing raw data. The method used in this paper is to first sample the position trajectory, form the irregular polygons of the high-frequency access points in the sampling points and position data, calculate the center of gravity of the polygon, and then use the differential privacy protection algorithm to add noise to the center of gravity of the polygon to form a new one. The center of gravity, and the new center of gravity are connected to form a new trajectory. The purpose of protecting the position trajectory is well achieved. It is proved that the differential privacy protection algorithm can effectively protect the position trajectory by adding noise.
The main objective of this paper is to present a more secured and computationally efficient procedure of encrypting and decrypting images using the enigma algorithm in comparison to the existing methods. Available literature on image encryptions and descriptions are not highly secured in every case.To achieve more secured image processing for highly advanced technologies, a proposed algorithm can be the process used in enigma machine for image encryption and decryption. Enigma machine is piece of spook hardware that was used frequently during the World War II by the Germans. This paper describes the detailed algorithm along with proper demonstration of several essential components present in an enigma machine that is required for image security. Each pixel in a colorful picture can be represented by RGB (Red, Green, Blue) value. The range of RGB values is 0 to 255 that states the red, green and blue intensity of a particular picture.These RGB values are accessed one by one and changed into another by various steps and hence it is not possible to track the original RGB value. In order to retrieve the original image, the receiver needs to know the setting of the enigma. To compare the decrypted image with the original one,these two images are subtracted and their results are also discussed in this paper.
This article will consider the probability test of Solovey-Strassen, to determine the simplicity of the number and its possible modifications. This test allows for the shortest possible time to determine whether the number is prime or not. C\# programming language was used to implement the algorithm in practice.
The present age of digital information has presented a heterogeneous online environment which makes it a formidable mission for a noble user to search and locate the required online resources timely. Recommender systems were implemented to rescue this information overload issue. However, majority of recommendation algorithms focused on the accuracy of the recommendations, leaving out other important aspects in the definition of good recommendation such as diversity and serendipity. This results in low coverage, long-tail items often are left out in the recommendations as well. In this paper, we present and explore a recommendation technique that ensures that diversity, accuracy and serendipity are all factored in the recommendations. The proposed algorithm performed comparatively well as compared to other algorithms in literature.
The keys generated by (symmetric or asymmetric) have been still compromised by attackers. Cryptography algorithms need extra efforts to enhance the security of keys that are transferring between parities. Also, using cryptography algorithms increase time consumption and overhead cost through communication. Encryption is very important issue for protecting information from stealing. Unfortunately encryption can achieve confidentiality not integrity. Covert channel allows two parties to indirectly send information, where the main drawbacks of covert channel are detectability and the security of pre-agreement knowledge. In this paper, i merge between encryption, authentication and convert channel to achieve un-detectability covert channel. This channel guarantee integrity and confidentiality of covert data and sending data dynamically. I propose and implement un-detectability a covert channel using AES (Advanced Encryption Standard) algorithm and HMAC (Hashed Message Authentication Code). Where this channel is un-detectability with integrity and confidentiality agreement process between the sender and the receiver. Instead of sending fake key directly through channel, encryption and HMAC function used to hide fake key. After that investigations techniques for improving un-detectability of channel is proposed.
Malware detection has been widely studied by analysing either file dropping relationships or characteristics of the file distribution network. This paper, for the first time, studies a global heterogeneous malware delivery graph fusing file dropping relationship and the topology of the file distribution network. The integration offers a unique ability of structuring the end-to-end distribution relationship. However, it brings large heterogeneous graphs to analysis. In our study, an average daily generated graph has more than 4 million edges and 2.7 million nodes that differ in type, such as IPs, URLs, and files. We propose a novel Bayesian label propagation model to unify the multi-source information, including content-agnostic features of different node types and topological information of the heterogeneous network. Our approach does not need to examine the source codes nor inspect the dynamic behaviours of a binary. Instead, it estimates the maliciousness of a given file through a semi-supervised label propagation procedure, which has a linear time complexity w.r.t. the number of nodes and edges. The evaluation on 567 million real-world download events validates that our proposed approach efficiently detects malware with a high accuracy.
We study the problem of estimating distinct elements in the data stream model, which has a central role in traffic monitoring, query optimization, data mining and data integration. Different from all previous work, we study the problem in the noisy data setting, where two different looking items in the stream may reference the same entity (determined by a distance function and a threshold value), and the goal is to estimate the number of distinct entities in the stream. In this paper, we formalize the problem of robust distinct elements, and develop space and time-efficient streaming algorithms for datasets in the Euclidean space, using a novel technique we call bucket sampling. We also extend our algorithmic framework to other metric spaces by establishing a connection between bucket sampling and the theory of locality sensitive hashing. Moreover, we formally prove that our algorithms are still effective under small distinct elements ambiguity. Our experiments demonstrate the practicality of our algorithms.
The use of flashSSDs has increased rapidly in a wide range of areas due to their superior energy efficiency, shorter access time, and higher bandwidth when compared to HDDs. The internal parallelism created by multiple flash memory packages embedded in a flashSSDs, is one of the unique features of flashSSDs. Many new DBMS technologies have been developed for flashSSDs, but query processing for flashSSDs have drawn less attention than other DBMS technologies. Hash partitioning is popularly used in query processing algorithms to materialize their intermediate results in an efficient manner. In this paper, we propose a novel hash partitioning algorithm that exploits the internal parallelism of flashSSDs. The devised hash partitioning method outperforms the traditional hash partitioning technique regardless of the amount of available main memory independently from the buffer management strategies (blocked I/O vs page sized I/O). We implemented our method based on the source code of the PostgreSQL storage manager. PostgreSQL relation files created by the TPC-H workload were employed in the experiments. Our method was found to be up to 3.55 times faster than the traditional method with blocked I/O, and 2.36 times faster than the traditional method with pagesized I/O.
We propose a novel, scalable, and principled graph sketching technique based on minwise hashing of local neighborhood. For an n-node graph with e-edges (e textgreatertextgreater n), we incrementally maintain in real-time a minwise neighbor sampled subgraph using k hash functions in O(n x k) memory, limit being user-configurable by the parameter k. Symmetrization and similarity based techniques can recover from these data structures a significant portion of the original graph. We present theoretical analysis of the minwise sampling strategy and also derive unbiased estimators for important graph properties such as triangle count and neighborhood overlap. We perform an extensive empirical evaluation of our graph sketch and it's derivatives on a wide variety of real-world graph data sets drawn from different application domains using important large network analysis algorithms: local and global clustering coefficient, PageRank, and local graph sparsification. With bounded memory, the quality of results using the sketch representation is competitive against baselines which use the full graph, and the computational performance is often better. Our framework is flexible and configurable to be leveraged by numerous other graph analytics algorithms, potentially reducing the information mining time on large streamed graphs for a variety of applications.
Recently, multimodal hashing techniques have received considerable attention due to their low storage cost and fast query speed for multimodal data retrieval. Many methods have been proposed; however, there are still some problems that need to be further considered. For example, some of these methods just use a similarity matrix for learning hash functions which will discard some useful information contained in original data; some of them relax binary constraints or separate the process of learning hash functions and binary codes into two independent stages to bypass the obstacle of handling the discrete constraints on binary codes for optimization, which may generate large quantization error; some of them are not robust to noise. All these problems may degrade the performance of a model. To consider these problems, in this paper, we propose a novel supervised hashing framework for cross-modal retrieval, i.e., Supervised Robust Discrete Multimodal Hashing (SRDMH). Specifically, SRDMH tries to make final binary codes preserve label information as same as that in original data so that it can leverage more label information to supervise the binary codes learning. In addition, it learns hashing functions and binary codes directly instead of relaxing the binary constraints so as to avoid large quantization error problem. Moreover, to make it robust and easy to solve, we further integrate a flexible l2,p loss with nonlinear kernel embedding and an intermediate presentation of each instance. Finally, an alternating algorithm is proposed to solve the optimization problem in SRDMH. Extensive experiments are conducted on three benchmark data sets. The results demonstrate that the proposed method (SRDMH) outperforms or is comparable to several state-of-the-art methods for cross-modal retrieval task.
Hashing based methods have attracted considerable attention for efficient cross-modal retrieval on large-scale multimedia data. The core problem of cross-modal hashing is how to effectively integrate heterogeneous features from different modalities to learn hash functions using available supervising information, e.g., class labels. Existing hashing based methods generally project heterogeneous features to a common space for hash codes generation, and the supervising information is incrementally used for improving performance. However, these methods may produce ineffective hash codes, due to the failure to explore the discriminative property of supervising information and to effectively bridge the semantic gap between different modalities. To address these challenges, we propose a novel hashing based method in a linear classification framework, in which the proposed method learns modality-specific hash functions for generating unified binary codes, and these binary codes are viewed as representative features for discriminative classification with class labels. An effective optimization algorithm is developed for the proposed method to jointly learn the modality-specific hash function, the unified binary codes and a linear classifier. Extensive experiments on three benchmark datasets highlight the advantage of the proposed method and show that it achieves the state-of-the-art performance.