Scalable Hadoop-Based Pooled Time Series of Big Video Data from the Deep Web
Title | Scalable Hadoop-Based Pooled Time Series of Big Video Data from the Deep Web |
Publication Type | Conference Paper |
Year of Publication | 2017 |
Authors | Mattmann, Chris A.; Sharan, Madhav |
Conference Name | Proceedings of the 2017 ACM on International Conference on Multimedia Retrieval |
Date Published | June 2017 |
Publisher | ACM |
Conference Location | New York, NY, USA |
ISBN Number | 978-1-4503-4701-3 |
Keywords | dark web, darpa, Hadoop, Human Behavior, human factors, memex, pooled time series, pubcrawl, video |
Abstract | We contribute a scalable, open-source implementation of the Pooled Time Series (PoT) algorithm from CVPR 2015. The algorithm is evaluated on approximately 6800 human trafficking (HT) videos collected from the deep and dark web, and on an open dataset: the Human Motion Database (HMDB). We describe PoT, our motivation for using it on larger data, and the issues we encountered. Our new solution reimagines PoT as an Apache Hadoop-based algorithm. We demonstrate that our new Hadoop-based algorithm successfully identifies similar videos in the HT and HMDB datasets, and we evaluate the algorithm qualitatively and quantitatively. |
URL | http://doi.acm.org/10.1145/3078971.3079019 |
DOI | 10.1145/3078971.3079019 |
Citation Key | mattmann_scalable_2017 |