Biblio

Filters: Author is Cheng, Jian
2018-06-11
Hu, Qinghao, Wu, Jiaxiang, Bai, Lu, Zhang, Yifan, Cheng, Jian.  2017.  Fast K-means for Large Scale Clustering. Proceedings of the 2017 ACM on Conference on Information and Knowledge Management. :2099–2102.

The k-means algorithm has been widely used in machine learning and data mining due to its simplicity and good performance. However, standard k-means is quite slow when clustering millions of data points into thousands or even tens of thousands of clusters. In this paper, we propose a fast k-means algorithm named multi-stage k-means (MKM), which uses a multi-stage filtering approach. This approach greatly accelerates k-means via a coarse-to-fine search strategy. To further speed up the algorithm, hashing is introduced to accelerate the assignment step, which is the most time-consuming part of k-means. Extensive experiments on several massive datasets show that the proposed algorithm can obtain up to a 600x speed-up over the standard k-means algorithm with comparable accuracy.
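
The abstract does not spell out how the filtering stages are built, so the following is only a minimal sketch of the general coarse-to-fine idea applied to the assignment step, not the MKM algorithm itself. It assumes a two-stage scheme in which fine centroids are bucketed under a small set of coarse centers, and a point is compared only against the centroids in its closest bucket(s). The names `build_coarse_groups`, `coarse_to_fine_assign`, and `n_probe` are hypothetical, and the hashing stage mentioned in the abstract is not modeled.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(point, candidates, vectors):
    """Return the index from `candidates` whose vector is closest to `point`."""
    return min(candidates, key=lambda i: sq_dist(point, vectors[i]))


def build_coarse_groups(centroids, coarse_centers):
    """Bucket each fine centroid under its nearest coarse center (assumed scheme)."""
    groups = [[] for _ in coarse_centers]
    all_coarse = range(len(coarse_centers))
    for idx, c in enumerate(centroids):
        groups[nearest(c, all_coarse, coarse_centers)].append(idx)
    return groups


def coarse_to_fine_assign(point, centroids, coarse_centers, groups, n_probe=1):
    """Approximate nearest-centroid search over the n_probe closest coarse groups."""
    # Coarse stage: rank the (few) coarse centers by distance to the point.
    order = sorted(range(len(coarse_centers)),
                   key=lambda g: sq_dist(point, coarse_centers[g]))
    # Fine stage: only centroids in the probed groups remain candidates.
    candidates = [i for g in order[:n_probe] for i in groups[g]]
    if not candidates:
        # Fall back to exhaustive search if every probed group is empty.
        candidates = range(len(centroids))
    return nearest(point, candidates, centroids)


# Toy data: four fine centroids in two well-separated regions.
centroids = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
coarse_centers = [(0.0, 0.5), (10.0, 10.5)]
groups = build_coarse_groups(centroids, coarse_centers)
label = coarse_to_fine_assign((9.8, 10.9), centroids, coarse_centers, groups)
```

With `n_probe=1` each point is compared against the coarse centers plus one bucket of centroids rather than all of them, which is where the saving comes from. The result is approximate: a point near a bucket boundary may miss its true nearest centroid unless `n_probe` is raised.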

2018-01-10
Hu, Qinghao, Wu, Jiaxiang, Cheng, Jian, Wu, Lifang, Lu, Hanqing.  2017.  Pseudo Label Based Unsupervised Deep Discriminative Hashing for Image Retrieval. Proceedings of the 2017 ACM on Multimedia Conference. :1584–1590.

Hashing methods play an important role in large-scale image retrieval. Traditional hashing methods use hand-crafted features to learn hash functions, which cannot capture high-level semantic information. Deep hashing algorithms use deep neural networks to learn feature representations and hash functions simultaneously. Most of these algorithms exploit supervised information to train the deep network. However, supervised information is expensive to obtain. In this paper, we propose a pseudo-label-based unsupervised deep discriminative hashing algorithm. First, we cluster images via k-means and treat the cluster labels as pseudo labels. Then we train a deep hashing network with the pseudo labels by minimizing the classification loss and the quantization loss. Experiments on two datasets demonstrate that our unsupervised deep discriminative hashing method outperforms state-of-the-art unsupervised hashing methods.
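
The abstract names the two loss terms but not their exact form. As a rough illustration only, the sketch below assumes a softmax cross-entropy against the k-means pseudo label for the classification term, and a mean squared gap between the real-valued code and its sign binarization for the quantization term. The weight `lam` and all function names are hypothetical, and the network that would produce `logits` and `code` is omitted.

```python
import math


def softmax_cross_entropy(logits, label):
    """Cross-entropy of one sample against an integer pseudo label."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[label]


def binarize(code):
    """Map a real-valued code to a {-1, +1} hash code by sign."""
    return [1 if h >= 0 else -1 for h in code]


def quantization_loss(code):
    """Mean squared gap between a real-valued code and its binarized version."""
    bits = binarize(code)
    return sum((h - b) ** 2 for h, b in zip(code, bits)) / len(code)


def total_loss(logits, code, pseudo_label, lam=0.1):
    """Classification loss plus lam-weighted quantization loss (lam is assumed)."""
    return softmax_cross_entropy(logits, pseudo_label) + lam * quantization_loss(code)


# One sample: two pseudo classes with equal logits, and a 2-bit real-valued code.
loss = total_loss(logits=[0.0, 0.0], code=[0.5, -0.5], pseudo_label=0)
bits = binarize([0.3, -2.0, 0.0])
```

Under these assumptions, the quantization term pushes each code entry toward -1 or +1 so that little information is lost when the code is binarized for retrieval, while the classification term makes the codes separate the pseudo classes found by k-means.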