Deep Learning Based Semantic Video Indexing and Retrieval
This work addresses video search and retrieval for users who need efficient access to semantic content, but it appears incremental because it builds on existing deep learning methods.
The authors tackled the problem of semantic video indexing and retrieval by using convolutional neural network features as universal signatures, and they implemented a graph-based storage structure that enables efficient retrieval with complex spatial and temporal queries.
We share the implementation details and test results for a video retrieval system based exclusively on features extracted by convolutional neural networks. We show that deep learned features can serve as a universal signature for the semantic content of video, useful in many search and retrieval tasks. We further show that a graph-based storage structure for the video index allows efficient retrieval of content with complicated spatial and temporal search queries.
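To make the idea of a graph-based index with spatial and temporal queries concrete, the following is a minimal sketch and not the authors' implementation. Everything in it is an assumption made for illustration: the `Detection` record, the `VideoGraphIndex` class, the relation names `left_of` and `before`, and the toy two-dimensional feature vectors that stand in for CNN features. It keeps detections as graph nodes, links them with spatial edges (same frame) and temporal edges (same video, earlier frame), and answers label-relation-label queries as well as nearest-signature lookups by cosine similarity.

```python
from dataclasses import dataclass
from math import sqrt


@dataclass(frozen=True)
class Detection:
    """Hypothetical index node: one detected object in one frame."""
    video: str
    frame: int
    label: str
    x_center: float   # horizontal position, normalized to [0, 1]
    feature: tuple    # stand-in for a CNN feature vector


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


class VideoGraphIndex:
    """Toy in-memory graph: nodes are detections, edges are relations."""

    def __init__(self):
        self.nodes = []
        self.edges = {}  # (source index, relation) -> list of target indices

    def _link(self, src, relation, dst):
        self.edges.setdefault((src, relation), []).append(dst)

    def add(self, det):
        """Insert a detection and link it to earlier nodes of the same video."""
        new = len(self.nodes)
        for old, other in enumerate(self.nodes):
            if other.video != det.video:
                continue
            if other.frame == det.frame:
                # Spatial edge between objects that share a frame.
                if other.x_center < det.x_center:
                    self._link(old, "left_of", new)
                elif det.x_center < other.x_center:
                    self._link(new, "left_of", old)
            elif other.frame < det.frame:
                # Temporal edge from the earlier frame to the later one.
                self._link(old, "before", new)
            else:
                self._link(new, "before", old)
        self.nodes.append(det)
        return new

    def query(self, src_label, relation, dst_label):
        """Return (source, target) pairs matching label-relation-label."""
        hits = []
        for (src, rel), targets in self.edges.items():
            if rel != relation or self.nodes[src].label != src_label:
                continue
            for dst in targets:
                if self.nodes[dst].label == dst_label:
                    hits.append((self.nodes[src], self.nodes[dst]))
        return hits

    def nearest(self, feature, k=1):
        """Return the k detections whose signatures best match `feature`."""
        ranked = sorted(self.nodes,
                        key=lambda d: cosine(d.feature, feature),
                        reverse=True)
        return ranked[:k]


# Usage with made-up detections and features.
index = VideoGraphIndex()
index.add(Detection("clip1", 0, "person", 0.2, (1.0, 0.0)))
index.add(Detection("clip1", 0, "car", 0.7, (0.0, 1.0)))
index.add(Detection("clip1", 5, "dog", 0.5, (0.6, 0.8)))

spatial_hits = index.query("person", "left_of", "car")
temporal_hits = index.query("car", "before", "dog")
best_match = index.nearest((0.1, 0.9), k=1)[0]
```

This sketch scans all edges for each query and compares every new node against all existing ones, so it illustrates the data model only. An efficient system would presumably rely on a graph database or indexed adjacency lists, and on an approximate nearest-neighbor structure for the feature lookup.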