CV · Oct 21, 2016

Scalable Pooled Time Series of Big Video Data from the Deep Web

arXiv:1610.06669v1 · 1.11 citations

Originality: Synthesis-oriented

AI Analysis

This work provides a scalable solution for analyzing big video data in domains like law enforcement, though it is incremental as it adapts an existing method to new data and infrastructure.

The authors tackled the challenge of scaling the Pooled Time Series algorithm for large video datasets by re-implementing it on Hadoop, and they demonstrated its effectiveness on a dataset of approximately 6800 deep web videos related to human trafficking.

We contribute a scalable implementation of Ryoo et al.'s Pooled Time Series algorithm from CVPR 2015. The updated algorithm has been evaluated on a large and diverse dataset of approximately 6800 videos related to human trafficking, collected from a crawl of the deep web as part of DARPA's MEMEX effort. We describe the properties of Pooled Time Series and the motivation for using it to relate videos collected from the deep web. We highlight issues that we found while running Pooled Time Series on larger datasets and discuss solutions for those issues. Our solution centers on re-imagining Pooled Time Series as a Hadoop-based algorithm in which we compute portions of the eventual solution in parallel on large commodity clusters. We demonstrate that our new Hadoop-based algorithm works well on the 6800-video dataset and shares all of the properties described in the CVPR 2015 paper. We suggest avenues of future work in the project.
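The idea of computing portions of the solution in parallel can be sketched roughly as follows. This is a minimal illustration, not the authors' Hadoop implementation. It assumes each video is already reduced to a list of per-frame descriptor vectors, uses only max and sum pooling as a simplification, compares videos with cosine similarity (an assumption), and uses a local thread pool as a stand-in for map tasks on a cluster. All function names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import combinations
from math import sqrt


def pool_time_series(frames):
    """Map-style step: pool one video's per-frame descriptors over time.

    `frames` is a list of equal-length numeric lists (one per frame).
    Returns the per-dimension max followed by the per-dimension sum.
    """
    dims = list(zip(*frames))  # one tuple of values per descriptor dimension
    return [max(d) for d in dims] + [sum(d) for d in dims]


def cosine(a, b):
    """Cosine similarity of two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def similarity_matrix(videos, workers=4):
    """Pool every video independently, then compare all pairs.

    `videos` maps a video name to its list of per-frame descriptors.
    The pooling step has no cross-video dependencies, which is what
    makes it a natural candidate for parallel map tasks.
    """
    names = sorted(videos)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        pooled_vectors = pool.map(pool_time_series, (videos[n] for n in names))
        pooled = dict(zip(names, pooled_vectors))
    # Reduce-style step: pairwise similarity over the pooled vectors.
    return {(a, b): cosine(pooled[a], pooled[b]) for a, b in combinations(names, 2)}
```

In this sketch the per-video pooling is the independent work that can be spread across machines, while the pairwise comparison consumes the pooled vectors afterwards. How the paper actually partitions the work across Hadoop jobs is not stated in the abstract.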
