CV | Jun 8, 2015

Circulant temporal encoding for video retrieval and temporal alignment

arXiv:1506.02588v2 | 50 citations
Originality: Incremental advance
AI Analysis

This work addresses video retrieval and alignment for specific events, offering incremental improvements in efficiency and accuracy for applications like multimedia search and synchronized playback.

The paper tackles the retrieval and temporal alignment of videos of a specific event, such as a Madonna concert. It encodes the frame descriptors of each video so that they jointly represent appearance and temporal order, then compares videos in the frequency domain. This reduces the comparison cost significantly while still accurately localizing the matching parts.

We address the problem of specific video event retrieval. Given a query video of a specific event, e.g., a concert of Madonna, the goal is to retrieve other videos of the same event that temporally overlap with the query. Our approach encodes the frame descriptors of a video to jointly represent their appearance and temporal order. It exploits the properties of circulant matrices to efficiently compare the videos in the frequency domain. This offers a significant gain in complexity and accurately localizes the matching parts of videos. The descriptors can be compressed in the frequency domain with a product quantizer adapted to complex numbers. In this case, video retrieval is performed without decompressing the descriptors. We also consider the temporal alignment of a set of videos. We exploit the matching confidence and an estimate of the temporal offset computed for all pairs of videos by our retrieval approach. Our robust algorithm aligns the videos on a global timeline by maximizing the set of temporally consistent matches. The global temporal alignment enables synchronous playback of the videos of a given scene.
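The frequency-domain comparison described in the abstract can be illustrated with a minimal sketch. It assumes each video is given as a NumPy array of per-frame descriptors and scores every temporal offset with a plain FFT-based cross-correlation. The function names (`temporal_match_scores`, `best_offset`) and the zero-padding choice are illustrative assumptions, not the authors' code, and the paper's exact formulation (for example its normalization and the product-quantized compression) may differ.

```python
import numpy as np

def temporal_match_scores(query, database):
    """Score every temporal offset between two descriptor sequences.

    query:    (m, d) array, one d-dimensional descriptor per frame.
    database: (n, d) array.
    Returns (scores, offsets), where scores[i] is
    sum_t <query[t], database[t + offsets[i]]> over the overlapping frames.
    """
    m, d = query.shape
    n, d2 = database.shape
    if d != d2:
        raise ValueError("descriptor dimensions must match")

    # Zero-pad to m + n frames so the circular correlation computed by the
    # FFT does not wrap one end of a video onto the other.
    size = m + n
    Q = np.fft.rfft(query, n=size, axis=0)
    B = np.fft.rfft(database, n=size, axis=0)

    # Correlation theorem: multiply conj(Q) by B per frequency, sum over the
    # descriptor dimensions, and transform back once.
    scores = np.fft.irfft((np.conj(Q) * B).sum(axis=1), n=size)

    # Indices >= n correspond to negative offsets (query starts earlier).
    offsets = np.arange(size)
    offsets[offsets >= n] -= size
    return scores, offsets

def best_offset(query, database):
    """Return (offset, score) of the highest-scoring temporal alignment."""
    scores, offsets = temporal_match_scores(query, database)
    i = int(np.argmax(scores))
    return int(offsets[i]), float(scores[i])
```

In this sketch a positive offset `k` means that frame `t` of the query matches frame `t + k` of the database video. Summing over descriptor dimensions before the inverse transform means only one inverse FFT is needed per video pair, which is where the complexity gain over comparing all frame pairs comes from.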

Code Implementations: 1 repo
