Biblio | CPS-VO

Liu, Qin, Pei, Shuyu, Xie, Kang, Wu, Jie, Peng, Tao, Wang, Guojun. 2018. Achieving Secure and Effective Search Services in Cloud Computing. 2018 17th IEEE International Conference On Trust, Security And Privacy In Computing And Communications/ 12th IEEE International Conference On Big Data Science And Engineering (TrustCom/BigDataSE). :1386–1391.

One critical challenge of today's cloud services is how to provide an effective search service while preserving user privacy. In this paper, we propose a wildcard-based multi-keyword fuzzy search (WMFS) scheme over the encrypted data, which tolerates keyword misspellings by exploiting the indecomposable property of primes. Compared with existing secure fuzzy search schemes, our WMFS scheme has the following merits: 1) Efficiency. It eliminates the requirement of a predefined dictionary and thus supports updates efficiently. 2) High accuracy. It eliminates the false positive and false negative introduced by specific data structures and thus allows the user to retrieve files as accurate as possible. 3) Flexibility. It gives the user great flexibility to specify different search patterns including keyword and substring matching. Extensive experiments on a real data set demonstrate the effectiveness and efficiency of our scheme.

Markchit, Sarawut, Chiu, Chih-Yi. 2019. Hash Code Indexing in Cross-Modal Retrieval. 2019 International Conference on Content-Based Multimedia Indexing (CBMI). :1—4.

Cross-modal hashing, which searches nearest neighbors across different modalities in the Hamming space, has become a popular technique to overcome the storage and computation barrier in multimedia retrieval recently. Although dozens of cross-modal hashing algorithms are proposed to yield compact binary code representation, applying exhaustive search in a large-scale dataset is impractical for the real-time purpose, and the Hamming distance computation suffers inaccurate results. In this paper, we propose a novel index scheme over binary hash codes in cross-modal retrieval. The proposed indexing scheme exploits a few binary bits of the hash code as the index code. Based on the index code representation, we construct an inverted index structure to accelerate the retrieval efficiency and train a neural network to improve the indexing accuracy. Experiments are performed on two benchmark datasets for retrieval across image and text modalities, where hash codes are generated by three cross-modal hashing methods. Results show the proposed method effectively boosts the performance over the benchmark datasets and hash methods.

Chung, S., Shieh, M., Chiueh, T.. 2018. A Security Proxy to Cloud Storage Backends Based on an Efficient Wildcard Searchable Encryption. 2018 IEEE 8th International Symposium on Cloud and Service Computing (SC2). :127-130.

Cloud storage backends such as Amazon S3 are a potential storage solution to enterprises. However, to couple enterprises with these backends, at least two problems must be solved: first, how to make these semi-trusted backends as secure as on-premises storage; and second, how to selectively retrieve files as easy as on-premises storage. A security proxy can address both the problems by building a local index from keywords in files before encrypting and uploading files to these backends. But, if the local index is built in plaintext, file content is still vulnerable to local malicious staff. Searchable Encryption (SE) can get rid of this vulnerability by making index into ciphertext; however, its known constructions often require modifications to index database, and, to support wildcard queries, they are not efficient at all. In this paper, we present a security proxy that, based on our wildcard SE construction, can securely and efficiently couple enterprises with these backends. In particular, since our SE construction can work directly with existing database systems, it incurs only a little overhead, and when needed, permits the security proxy to run with constantly small storage footprint by readily out-sourcing all built indices to existing cloud databases.

Quweider, M., Lei, H., Zhang, L., Khan, F.. 2018. Managing Big Data in Visual Retrieval Systems for DHS Applications: Combining Fourier Descriptors and Metric Space Indexing. 2018 1st International Conference on Data Intelligence and Security (ICDIS). :188-193.

Image retrieval systems have been an active area of research for more than thirty years progressively producing improved algorithms that improve performance metrics, operate in different domains, take advantage of different features extracted from the images to be retrieved, and have different desirable invariance properties. With the ever-growing visual databases of images and videos produced by a myriad of devices comes the challenge of selecting effective features and performing fast retrieval on such databases. In this paper, we incorporate Fourier descriptors (FD) along with a metric-based balanced indexing tree as a viable solution to DHS (Department of Homeland Security) needs to for quick identification and retrieval of weapon images. The FDs allow a simple but effective outline feature representation of an object, while the M-tree provide a dynamic, fast, and balanced search over such features. Motivated by looking for applications of interest to DHS, we have created a basic guns and rifles databases that can be used to identify weapons in images and videos extracted from media sources. Our simulations show excellent performance in both representation and fast retrieval speed.

Yang, C., Li, Z., Qu, W., Liu, Z., Qi, H.. 2017. Grid-Based Indexing and Search Algorithms for Large-Scale and High-Dimensional Data. 2017 14th International Symposium on Pervasive Systems, Algorithms and Networks 2017 11th International Conference on Frontier of Computer Science and Technology 2017 Third International Symposium of Creative Computing (ISPAN-FCST-ISCC). :46–51.

The rapid development of Internet has resulted in massive information overloading recently. These information is usually represented by high-dimensional feature vectors in many related applications such as recognition, classification and retrieval. These applications usually need efficient indexing and search methods for such large-scale and high-dimensional database, which typically is a challenging task. Some efforts have been made and solved this problem to some extent. However, most of them are implemented in a single machine, which is not suitable to handle large-scale database.In this paper, we present a novel data index structure and nearest neighbor search algorithm implemented on Apache Spark. We impose a grid on the database and index data by non-empty grid cells. This grid-based index structure is simple and easy to be implemented in parallel. Moreover, we propose to build a scalable KNN graph on the grids, which increase the efficiency of this index structure by a low cost in parallel implementation. Finally, experiments are conducted in both public databases and synthetic databases, showing that the proposed methods achieve overall high performance in both efficiency and accuracy.

Gurjar, S. P. S., Pasupuleti, S. K.. 2016. A privacy-preserving multi-keyword ranked search scheme over encrypted cloud data using MIR-tree. 2016 International Conference on Computing, Analytics and Security Trends (CAST). :533–538.

With increasing popularity of cloud computing, the data owners are motivated to outsource their sensitive data to cloud servers for flexibility and reduced cost in data management. However, privacy is a big concern for outsourcing data to the cloud. The data owners typically encrypt documents before outsourcing for privacy-preserving. As the volume of data is increasing at a dramatic rate, it is essential to develop an efficient and reliable ciphertext search techniques, so that data owners can easily access and update cloud data. In this paper, we propose a privacy preserving multi-keyword ranked search scheme over encrypted data in cloud along with data integrity using a new authenticated data structure MIR-tree. The MIR-tree based index with including the combination of widely used vector space model and TF×IDF model in the index construction and query generation. We use inverted file index for storing word-digest, which provides efficient and fast relevance between the query and cloud data. Design an authentication set(AS) for authenticating the queries, for verifying top-k search results. Because of tree based index, our scheme achieves optimal search efficiency and reduces communication overhead for verifying the search results. The analysis shows security and efficiency of our scheme.

Miyoung Jang, Min Yoon, Jae-Woo Chang. 2014. A privacy-aware query authentication index for database outsourcing. Big Data and Smart Computing (BIGCOMP), 2014 International Conference on. :72-76.

Recently, cloud computing has been spotlighted as a new paradigm of database management system. In this environment, databases are outsourced and deployed on a service provider in order to reduce cost for data storage and maintenance. However, the service provider might be untrusted so that the two issues of data security, including data confidentiality and query result integrity, become major concerns for users. Existing bucket-based data authentication methods have problem that the original spatial data distribution can be disclosed from data authentication index due to the unsophisticated data grouping strategies. In addition, the transmission overhead of verification object is high. In this paper, we propose a privacy-aware query authentication which guarantees data confidentiality and query result integrity for users. A periodic function-based data grouping scheme is designed to privately partition a spatial database into small groups for generating a signature of each group. The group signature is used to check the correctness and completeness of outsourced data when answering a range query to users. Through performance evaluation, it is shown that proposed method outperforms the existing method in terms of range query processing time up to 3 times.

Miyoung Jang, Min Yoon, Jae-Woo Chang. 2014. A privacy-aware query authentication index for database outsourcing. Big Data and Smart Computing (BIGCOMP), 2014 International Conference on. :72-76.

Recently, cloud computing has been spotlighted as a new paradigm of database management system. In this environment, databases are outsourced and deployed on a service provider in order to reduce cost for data storage and maintenance. However, the service provider might be untrusted so that the two issues of data security, including data confidentiality and query result integrity, become major concerns for users. Existing bucket-based data authentication methods have problem that the original spatial data distribution can be disclosed from data authentication index due to the unsophisticated data grouping strategies. In addition, the transmission overhead of verification object is high. In this paper, we propose a privacy-aware query authentication which guarantees data confidentiality and query result integrity for users. A periodic function-based data grouping scheme is designed to privately partition a spatial database into small groups for generating a signature of each group. The group signature is used to check the correctness and completeness of outsourced data when answering a range query to users. Through performance evaluation, it is shown that proposed method outperforms the existing method in terms of range query processing time up to 3 times.

Miyoung Jang, Min Yoon, Jae-Woo Chang. 2014. A privacy-aware query authentication index for database outsourcing. Big Data and Smart Computing (BIGCOMP), 2014 International Conference on. :72-76.

Recently, cloud computing has been spotlighted as a new paradigm of database management system. In this environment, databases are outsourced and deployed on a service provider in order to reduce cost for data storage and maintenance. However, the service provider might be untrusted so that the two issues of data security, including data confidentiality and query result integrity, become major concerns for users. Existing bucket-based data authentication methods have problem that the original spatial data distribution can be disclosed from data authentication index due to the unsophisticated data grouping strategies. In addition, the transmission overhead of verification object is high. In this paper, we propose a privacy-aware query authentication which guarantees data confidentiality and query result integrity for users. A periodic function-based data grouping scheme is designed to privately partition a spatial database into small groups for generating a signature of each group. The group signature is used to check the correctness and completeness of outsourced data when answering a range query to users. Through performance evaluation, it is shown that proposed method outperforms the existing method in terms of range query processing time up to 3 times.