Biblio
Nearest neighbor search algorithm plays a very important role in computer image algorithm. When the search data is large, we need to use fast search algorithm. The current fast retrieval algorithms are tree based algorithms. The efficiency of the tree algorithm decreases sharply with the increase of the data dimension. In this paper, a local integral hash nearest neighbor algorithm of the spatial space is proposed to construct the tree structure by changing the way of the node of the access tree. It is able to express data distribution characteristics. After experimental testing, this paper achieves more efficient performance in high dimensional data.
This paper focuses on a practically very important problem of matching a real-world product photo to exactly the same item(s) in online shopping sites. The task is extremely challenging because the user photos (i.e., the queries in this scenario) are often captured in uncontrolled environments, while the product images in online shops are mostly taken by professionals with clean backgrounds and perfect lighting conditions. To tackle the problem, we study deep network architectures and training schemes, with the goal of learning a robust deep feature representation that is able to bridge the domain gap between the user photos and the online product images. Our contributions are two-fold. First, we propose an alternative of the popular contrastive loss used in siamese deep networks, namely robust contrastive loss, where we "relax" the penalty on positive pairs to alleviate over-fitting. Second, a multi-task fine-tuning approach is introduced to learn a better feature representation, which not only incorporates knowledge from the provided training photo pairs, but also explores additional information from the large ImageNet dataset to regularize the fine-tuning procedure. Experiments on two challenging real-world datasets demonstrate that both the robust contrastive loss and the multi-task fine-tuning approach are effective, leading to very promising results with a time cost suitable for real-time retrieval.