CV · Oct 12, 2023

Dual-Stream Knowledge-Preserving Hashing for Unsupervised Video Retrieval

arXiv:2310.08009v1 · 21 citations · h-index: 38
Originality: Incremental advance
AI Analysis

This work addresses video retrieval efficiency for applications like multimedia search, but it is incremental as it builds on existing unsupervised hashing techniques.

The paper tackles the problem of unsupervised video hashing by disentangling semantic extraction from reconstruction constraints, resulting in a method that consistently outperforms state-of-the-art approaches on three video benchmarks.

Unsupervised video hashing usually optimizes binary codes by learning to reconstruct input videos. Such a reconstruction constraint spends much effort on frame-level temporal context changes without focusing on the video-level global semantics that are more useful for retrieval. Hence, we address this problem by decomposing video information into reconstruction-dependent and semantic-dependent information, which disentangles semantic extraction from the reconstruction constraint. Specifically, we first design a simple dual-stream structure, including a temporal layer and a hash layer. Then, with the help of semantic similarity knowledge obtained from self-supervision, the hash layer learns to capture information for semantic retrieval, while the temporal layer learns to capture information for reconstruction. In this way, the model naturally preserves the disentangled semantics in binary codes. Validated by comprehensive experiments, our method consistently outperforms state-of-the-art methods on three video benchmarks.
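
To make concrete why binary codes help retrieval efficiency, the following is a minimal sketch of Hamming-distance ranking over hash codes. It is not the paper's training procedure or architecture: the helper names (`binarize`, `hamming_distance`, `rank_by_hamming`), the 4-bit code length, and the feature values are all assumptions chosen for illustration, standing in for whatever real-valued outputs a learned hash layer would produce.

```python
from typing import List, Sequence


def binarize(features: Sequence[float]) -> int:
    """Pack the sign pattern of a real-valued vector into an integer bit code."""
    code = 0
    for value in features:
        # Positive activations map to bit 1, everything else to bit 0.
        code = (code << 1) | (1 if value > 0 else 0)
    return code


def hamming_distance(a: int, b: int) -> int:
    """Count the bit positions at which two codes differ."""
    return bin(a ^ b).count("1")


def rank_by_hamming(query: int, database: Sequence[int]) -> List[int]:
    """Return database indices sorted from nearest to farthest code."""
    return sorted(
        range(len(database)),
        key=lambda i: hamming_distance(query, database[i]),
    )


# Hypothetical hash-layer outputs for three database videos and one query.
database_codes = [
    binarize([0.9, -0.2, 0.4, -0.7]),   # bits 1010
    binarize([-0.5, 0.3, -0.1, 0.8]),   # bits 0101
    binarize([0.6, -0.4, -0.3, -0.9]),  # bits 1000
]
query_code = binarize([0.7, -0.1, 0.2, -0.3])  # bits 1010

ranking = rank_by_hamming(query_code, database_codes)
print(ranking)  # [0, 2, 1]
```

Comparing codes needs only an XOR and a bit count, which is why retrieval over compact binary codes is typically much cheaper than comparing real-valued video features.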

