Biblio

Filters: Keyword is multimedia retrieval
2022-03-08
Markchit, Sarawut.  2021.  K-mean Index Learning for Multimedia Datasets. 2021 13th International Conference on Knowledge and Smart Technology (KST). :6–11.
Approximate nearest neighbor (ANN) search is currently one way to handle the storage and computation demands of multimedia retrieval applications. Hashing algorithms and vector quantization (VQ) are widely used in ANN search, and K-means clustering is a VQ method that can address those demands. Multimedia retrieval has become very important as multimedia data, such as text, image, video, audio, and 3D views, continues to grow. Results for each media type can be retrieved by submitting a query of that type. Although many hashing algorithms and VQ techniques have been proposed to produce compact binary codes, exhaustive search is impractical for real-time use, and Hamming distance computation in the Hamming space yields inaccurate results. The challenge addressed in this paper is how to learn representations of raw multimedia data or features so that each media type can be searched for multimedia retrieval. We propose a new search method that uses K-means hash codes by computing the probability of a cluster in the index code. The proposed method takes the K-means cluster number, converted to a hash code, as the index code. An inverted index table is constructed based on the K-means hash code. We then improve the accuracy and efficiency of the original K-means index by training a deep neural network (DNN). We performed experiments on four benchmark multimedia datasets to retrieve each view, such as 3D, image, video, text, and audio, where hash codes are produced by K-means clustering methods. Our results show that the proposed method effectively boosts performance over the baseline (exhaustive search).
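The indexing idea in this abstract (cluster number converted to a binary code, then an inverted index table keyed by that code) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the toy 2D points, the seeding of K-means with the first k points, and the fixed-width binary encoding are all assumptions made for illustration, and the cluster-probability and DNN steps are left out.

```python
# Sketch: K-means cluster numbers as binary index codes for an inverted index.
# All names and data are hypothetical; the DNN step from the abstract is omitted.
from collections import defaultdict


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(centroids, v):
    """Index of the centroid closest to v."""
    return min(range(len(centroids)), key=lambda c: sq_dist(centroids[c], v))


def kmeans(points, k, iters=20):
    """Plain Lloyd's algorithm, seeded with the first k points (an assumption)."""
    centroids = [list(p) for p in points[:k]]
    for _ in range(iters):
        groups = defaultdict(list)
        for p in points:
            groups[nearest(centroids, p)].append(p)
        # Clusters that received no points keep their previous centroid.
        for c, members in groups.items():
            centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids


def to_code(cluster, k):
    """Cluster number written as a fixed-width binary string."""
    width = max(1, (k - 1).bit_length())
    return format(cluster, f"0{width}b")


def build_index(points, centroids):
    """Inverted index table: binary code -> ids of the points in that cluster."""
    table = defaultdict(list)
    k = len(centroids)
    for i, p in enumerate(points):
        table[to_code(nearest(centroids, p), k)].append(i)
    return table


def search(query, points, centroids, table, top=3):
    """Rank only the bucket that shares the query's code, not the whole dataset."""
    code = to_code(nearest(centroids, query), len(centroids))
    bucket = table.get(code, [])
    return sorted(bucket, key=lambda i: sq_dist(points[i], query))[:top]


points = [(0.0, 0.0), (5.0, 5.0), (9.0, 0.0), (0.1, 0.2), (5.2, 4.9), (9.1, 0.3)]
centroids = kmeans(points, k=3)
table = build_index(points, centroids)
hits = search((5.1, 5.0), points, centroids, table)
print(dict(table), hits)
```

In this sketch a query is compared only against the points in its own bucket, which is where the saving over exhaustive search would come from; a query that falls near a cluster boundary can miss true neighbors in adjacent buckets.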
2020-05-22
Markchit, Sarawut, Chiu, Chih-Yi.  2019.  Hash Code Indexing in Cross-Modal Retrieval. 2019 International Conference on Content-Based Multimedia Indexing (CBMI). :1–4.

Cross-modal hashing, which searches for nearest neighbors across different modalities in the Hamming space, has recently become a popular technique for overcoming the storage and computation barriers in multimedia retrieval. Although dozens of cross-modal hashing algorithms have been proposed to yield compact binary code representations, exhaustive search over a large-scale dataset is impractical for real-time use, and Hamming distance computation yields inaccurate results. In this paper, we propose a novel indexing scheme over binary hash codes for cross-modal retrieval. The proposed scheme uses a few bits of the hash code as the index code. Based on the index code representation, we construct an inverted index structure to accelerate retrieval and train a neural network to improve the indexing accuracy. Experiments are performed on two benchmark datasets for retrieval across image and text modalities, where hash codes are generated by three cross-modal hashing methods. Results show that the proposed method effectively boosts performance across the benchmark datasets and hash methods.
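As a rough illustration of the indexing scheme described above, the sketch below takes a few bits of each hash code as the index code, builds an inverted index over them, and ranks by Hamming distance inside one bucket. It is not the paper's method: the abstract does not say which bits are used, so taking the leading `INDEX_BITS` bits is an assumption, the hash codes and item names are invented toy values, and the neural network that refines the index is not modeled.

```python
# Sketch: use the leading bits of each binary hash code as an index code.
# INDEX_BITS and the toy codes are made up; the paper's neural network is omitted.
from collections import defaultdict

INDEX_BITS = 2  # hypothetical choice of how many bits form the index code


def index_code(code):
    """Leading INDEX_BITS bits of a hash code given as a bit string."""
    return code[:INDEX_BITS]


def hamming(a, b):
    """Number of positions where two equal-length bit strings differ."""
    return sum(x != y for x, y in zip(a, b))


def build_inverted_index(codes):
    """Map each index code to the ids of the items whose hash code starts with it."""
    table = defaultdict(list)
    for item_id, code in codes.items():
        table[index_code(code)].append(item_id)
    return table


def search(query_code, codes, table, top=2):
    """Rank by Hamming distance inside one bucket rather than over all items."""
    bucket = table.get(index_code(query_code), [])
    return sorted(bucket, key=lambda i: (hamming(codes[i], query_code), i))[:top]


# Toy image hash codes queried with a text hash code (cross-modal setting).
image_codes = {
    "img_a": "00101100",
    "img_b": "00111100",
    "img_c": "11010011",
    "img_d": "00000011",
}
table = build_inverted_index(image_codes)
text_query = "00101101"
results = search(text_query, image_codes, table)
print(results)
```

Here the text query is compared against only the three image codes in bucket `00` rather than all four, and the image whose index bits differ is never examined. That pruning is the source of the speedup, and also why the accuracy of the index code matters.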