Biblio
Person re-identification is an important task in video surveillance, focusing on finding the same person across different cameras. However, most existing methods of video-based person re-identification still have some limitations (e.g., the lack of effective deep learning framework, the robustness of the model, and the same treatment for all video frames) which make them unable to achieve better recognition performance. In this paper, we propose a novel self-paced learning algorithm for video-based person re-identification, which could gradually learn from simple to complex samples for a mature and stable model. Self-paced learning is employed to enhance video-based person re-identification based on deep neural network, so that deep neural network and self-paced learning are unified into one frame. Then, based on the trained self-paced learning, we propose to employ deep reinforcement learning to discard misleading and confounding frames and find the most representative frames from video pairs. With the advantage of deep reinforcement learning, our method can learn strategies to select the optimal frame groups. Experiments show that the proposed framework outperforms the existing methods on the iLIDS-VID, PRID-2011 and MARS datasets.
Hashing has shown its efficiency and effectiveness in facilitating large-scale multimedia applications. Supervised knowledge (\textbackslashemph\e.g.\, semantic labels or pair-wise relationship) associated to data is capable of significantly improving the quality of hash codes and hash functions. However, confronted with the rapid growth of newly-emerging concepts and multimedia data on the Web, existing supervised hashing approaches may easily suffer from the scarcity and validity of supervised information due to the expensive cost of manual labelling. In this paper, we propose a novel hashing scheme, termed \textbackslashemph\zero-shot hashing\ (ZSH), which compresses images of "unseen" categories to binary codes with hash functions learned from limited training data of "seen" categories. Specifically, we project independent data labels (i.e., 0/1-form label vectors) into semantic embedding space, where semantic relationships among all the labels can be precisely characterized and thus seen supervised knowledge can be transferred to unseen classes. Moreover, in order to cope with the semantic shift problem, we rotate the embedded space to more suitably align the embedded semantics with the low-level visual feature space, thereby alleviating the influence of semantic gap. In the meantime, to exert positive effects on learning high-quality hash functions, we further propose to preserve local structural property and discrete nature in binary codes. Besides, we develop an efficient alternating algorithm to solve the ZSH model. Extensive experiments conducted on various real-life datasets show the superior zero-shot image retrieval performance of ZSH as compared to several state-of-the-art hashing methods.