Biblio
Cross-modal hashing, which searches nearest neighbors across different modalities in the Hamming space, has become a popular technique to overcome the storage and computation barrier in multimedia retrieval recently. Although dozens of cross-modal hashing algorithms are proposed to yield compact binary code representation, applying exhaustive search in a large-scale dataset is impractical for the real-time purpose, and the Hamming distance computation suffers inaccurate results. In this paper, we propose a novel index scheme over binary hash codes in cross-modal retrieval. The proposed indexing scheme exploits a few binary bits of the hash code as the index code. Based on the index code representation, we construct an inverted index structure to accelerate the retrieval efficiency and train a neural network to improve the indexing accuracy. Experiments are performed on two benchmark datasets for retrieval across image and text modalities, where hash codes are generated by three cross-modal hashing methods. Results show the proposed method effectively boosts the performance over the benchmark datasets and hash methods.
Cloud storage backends such as Amazon S3 are a potential storage solution to enterprises. However, to couple enterprises with these backends, at least two problems must be solved: first, how to make these semi-trusted backends as secure as on-premises storage; and second, how to selectively retrieve files as easy as on-premises storage. A security proxy can address both the problems by building a local index from keywords in files before encrypting and uploading files to these backends. But, if the local index is built in plaintext, file content is still vulnerable to local malicious staff. Searchable Encryption (SE) can get rid of this vulnerability by making index into ciphertext; however, its known constructions often require modifications to index database, and, to support wildcard queries, they are not efficient at all. In this paper, we present a security proxy that, based on our wildcard SE construction, can securely and efficiently couple enterprises with these backends. In particular, since our SE construction can work directly with existing database systems, it incurs only a little overhead, and when needed, permits the security proxy to run with constantly small storage footprint by readily out-sourcing all built indices to existing cloud databases.
Image retrieval systems have been an active area of research for more than thirty years progressively producing improved algorithms that improve performance metrics, operate in different domains, take advantage of different features extracted from the images to be retrieved, and have different desirable invariance properties. With the ever-growing visual databases of images and videos produced by a myriad of devices comes the challenge of selecting effective features and performing fast retrieval on such databases. In this paper, we incorporate Fourier descriptors (FD) along with a metric-based balanced indexing tree as a viable solution to DHS (Department of Homeland Security) needs to for quick identification and retrieval of weapon images. The FDs allow a simple but effective outline feature representation of an object, while the M-tree provide a dynamic, fast, and balanced search over such features. Motivated by looking for applications of interest to DHS, we have created a basic guns and rifles databases that can be used to identify weapons in images and videos extracted from media sources. Our simulations show excellent performance in both representation and fast retrieval speed.
The rapid development of Internet has resulted in massive information overloading recently. These information is usually represented by high-dimensional feature vectors in many related applications such as recognition, classification and retrieval. These applications usually need efficient indexing and search methods for such large-scale and high-dimensional database, which typically is a challenging task. Some efforts have been made and solved this problem to some extent. However, most of them are implemented in a single machine, which is not suitable to handle large-scale database.In this paper, we present a novel data index structure and nearest neighbor search algorithm implemented on Apache Spark. We impose a grid on the database and index data by non-empty grid cells. This grid-based index structure is simple and easy to be implemented in parallel. Moreover, we propose to build a scalable KNN graph on the grids, which increase the efficiency of this index structure by a low cost in parallel implementation. Finally, experiments are conducted in both public databases and synthetic databases, showing that the proposed methods achieve overall high performance in both efficiency and accuracy.
With increasing popularity of cloud computing, the data owners are motivated to outsource their sensitive data to cloud servers for flexibility and reduced cost in data management. However, privacy is a big concern for outsourcing data to the cloud. The data owners typically encrypt documents before outsourcing for privacy-preserving. As the volume of data is increasing at a dramatic rate, it is essential to develop an efficient and reliable ciphertext search techniques, so that data owners can easily access and update cloud data. In this paper, we propose a privacy preserving multi-keyword ranked search scheme over encrypted data in cloud along with data integrity using a new authenticated data structure MIR-tree. The MIR-tree based index with including the combination of widely used vector space model and TF×IDF model in the index construction and query generation. We use inverted file index for storing word-digest, which provides efficient and fast relevance between the query and cloud data. Design an authentication set(AS) for authenticating the queries, for verifying top-k search results. Because of tree based index, our scheme achieves optimal search efficiency and reduces communication overhead for verifying the search results. The analysis shows security and efficiency of our scheme.
Recently, cloud computing has been spotlighted as a new paradigm of database management system. In this environment, databases are outsourced and deployed on a service provider in order to reduce cost for data storage and maintenance. However, the service provider might be untrusted so that the two issues of data security, including data confidentiality and query result integrity, become major concerns for users. Existing bucket-based data authentication methods have problem that the original spatial data distribution can be disclosed from data authentication index due to the unsophisticated data grouping strategies. In addition, the transmission overhead of verification object is high. In this paper, we propose a privacy-aware query authentication which guarantees data confidentiality and query result integrity for users. A periodic function-based data grouping scheme is designed to privately partition a spatial database into small groups for generating a signature of each group. The group signature is used to check the correctness and completeness of outsourced data when answering a range query to users. Through performance evaluation, it is shown that proposed method outperforms the existing method in terms of range query processing time up to 3 times.
Recently, cloud computing has been spotlighted as a new paradigm of database management system. In this environment, databases are outsourced and deployed on a service provider in order to reduce cost for data storage and maintenance. However, the service provider might be untrusted so that the two issues of data security, including data confidentiality and query result integrity, become major concerns for users. Existing bucket-based data authentication methods have problem that the original spatial data distribution can be disclosed from data authentication index due to the unsophisticated data grouping strategies. In addition, the transmission overhead of verification object is high. In this paper, we propose a privacy-aware query authentication which guarantees data confidentiality and query result integrity for users. A periodic function-based data grouping scheme is designed to privately partition a spatial database into small groups for generating a signature of each group. The group signature is used to check the correctness and completeness of outsourced data when answering a range query to users. Through performance evaluation, it is shown that proposed method outperforms the existing method in terms of range query processing time up to 3 times.
Recently, cloud computing has been spotlighted as a new paradigm of database management system. In this environment, databases are outsourced and deployed on a service provider in order to reduce cost for data storage and maintenance. However, the service provider might be untrusted so that the two issues of data security, including data confidentiality and query result integrity, become major concerns for users. Existing bucket-based data authentication methods have problem that the original spatial data distribution can be disclosed from data authentication index due to the unsophisticated data grouping strategies. In addition, the transmission overhead of verification object is high. In this paper, we propose a privacy-aware query authentication which guarantees data confidentiality and query result integrity for users. A periodic function-based data grouping scheme is designed to privately partition a spatial database into small groups for generating a signature of each group. The group signature is used to check the correctness and completeness of outsourced data when answering a range query to users. Through performance evaluation, it is shown that proposed method outperforms the existing method in terms of range query processing time up to 3 times.