Biblio

Filters: Author is Sebe, Nicu
2018-11-19
Duta, Ionut C., Ionescu, Bogdan, Aizawa, Kiyoharu, Sebe, Nicu.  2017.  Simple, Efficient and Effective Encodings of Local Deep Features for Video Action Recognition. Proceedings of the 2017 ACM on International Conference on Multimedia Retrieval. :218–225.

In an action recognition system, a decisive component is the feature encoding step, which builds the final representation that serves as input to a classifier. One shortcoming of existing encoding approaches is that they are built around hand-crafted features and are not highly competitive when encoding current deep features, which are necessary in many practical scenarios. In this work we propose two solutions specifically designed for encoding local deep features; they take advantage of the nature of deep networks by focusing on capturing the highest feature response of the convolutional maps. The proposed approaches encapsulate the features extracted with a convolutional neural network over the entire video. In terms of accuracy, our encodings outperform the most widely used and powerful current encoding approaches by a large margin, while being extremely efficient in computational cost. Evaluated on action recognition tasks, our pipeline obtains state-of-the-art results on three challenging datasets: HMDB51, UCF50 and UCF101.
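
The idea of keeping the highest response of each convolutional map over a whole video can be illustrated with a minimal sketch. This is not the paper's encoding, which the abstract does not specify; it is plain per-channel max pooling followed by L2 normalization. The function name `max_response_encoding` and the `video_features[t][i][c]` layout (frame, spatial location, channel) are assumptions made for illustration.

```python
import math
from typing import List, Sequence


def max_response_encoding(
    video_features: Sequence[Sequence[Sequence[float]]],
) -> List[float]:
    """Pool local deep features over a whole video by per-channel maximum.

    Assumed (hypothetical) layout: video_features[t][i][c] is the response of
    channel c at spatial location i in frame t.
    """
    if not video_features or not video_features[0]:
        raise ValueError("video must contain at least one local feature")

    channels = len(video_features[0][0])
    pooled = [-math.inf] * channels

    # Keep the highest response seen for each channel across all frames
    # and all spatial locations.
    for frame in video_features:
        for local_feature in frame:
            if len(local_feature) != channels:
                raise ValueError("all local features must have the same length")
            for c, value in enumerate(local_feature):
                if value > pooled[c]:
                    pooled[c] = value

    # L2-normalize so videos of different lengths give comparable vectors.
    norm = math.sqrt(sum(v * v for v in pooled))
    return [v / norm for v in pooled] if norm > 0 else pooled


# Toy video: 2 frames, 2 spatial locations per frame, 2 channels.
toy_video = [
    [[0.0, 3.0], [1.0, 0.0]],
    [[4.0, 1.0], [2.0, 2.0]],
]
encoding = max_response_encoding(toy_video)
```

The output length equals the number of channels regardless of how many frames the video has, which is what lets a fixed-size vector be passed to a classifier.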