CV · Apr 27

OmniShotCut: Holistic Relational Shot Boundary Detection with Shot-Query Transformer

arXiv:2604.24762 · 81.7
AI Analysis

This work addresses the need for more accurate and interpretable shot boundary detection in video analysis, a task critical for video editing and understanding.

OmniShotCut formulates shot boundary detection as structured relational prediction using a shot-query Transformer, achieving state-of-the-art performance with 95.3% F1 on a new synthetic benchmark, outperforming prior methods by 5-10%.
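For readers unfamiliar with how an F1 score is typically obtained for boundary detection, the sketch below shows one common convention: a predicted boundary counts as a true positive if it lies within a frame tolerance of an unmatched ground-truth boundary. The summary does not state the matching protocol behind the reported figure, so the function name `boundary_f1`, the `tolerance` parameter, and the one-to-one matching rule are all illustrative assumptions, not the paper's evaluation code.

```python
def boundary_f1(pred, truth, tolerance=0):
    """F1 between predicted and ground-truth boundary frame indices.

    Assumption (not from the paper): each ground-truth boundary can be
    matched by at most one prediction within `tolerance` frames.
    """
    pred, truth = sorted(pred), sorted(truth)
    i = j = true_pos = 0
    # Two-pointer sweep over both sorted lists, matching one-to-one.
    while i < len(pred) and j < len(truth):
        if abs(pred[i] - truth[j]) <= tolerance:
            true_pos += 1
            i += 1
            j += 1
        elif pred[i] < truth[j]:
            i += 1  # prediction has no partner: false positive
        else:
            j += 1  # ground-truth boundary was missed: false negative
    precision = true_pos / len(pred) if pred else 0.0
    recall = true_pos / len(truth) if truth else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Two of three predictions fall within 2 frames of a true boundary.
    print(boundary_f1([10, 52, 90], [10, 50, 120], tolerance=2))
```

With a tolerance of 0 this reduces to exact frame matching, which is the stricter setting that precisely labeled synthetic data makes feasible.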

Shot Boundary Detection (SBD) aims to automatically identify shot changes and divide a video into coherent shots. Although SBD has been widely studied, existing state-of-the-art methods often produce non-interpretable boundaries on transitions, miss subtle yet harmful discontinuities, and rely on noisy, low-diversity annotations and outdated benchmarks. To address these limitations, we propose OmniShotCut, which formulates SBD as structured relational prediction: a shot-query-based dense video Transformer jointly estimates shot ranges together with intra-shot and inter-shot relations. To avoid imprecise manual labeling, we adopt a fully synthetic transition pipeline that automatically reproduces major transition families with precise boundaries and parameterized variants. We also introduce OmniShotCutBench, a modern wide-domain benchmark enabling holistic and diagnostic evaluation.
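To make the idea of synthetic transitions with exact labels concrete, here is a minimal sketch of one transition family, a linear dissolve (cross-fade), with a single parameter for its duration. It is not the paper's pipeline: the function name `synthesize_dissolve`, the `(frames, height, width, channels)` array layout, and the linear blend are assumptions chosen for illustration. The point is that because the transition is generated, its start and end frames are known exactly rather than annotated by hand.

```python
import numpy as np


def synthesize_dissolve(clip_a, clip_b, duration):
    """Cross-fade the end of clip_a into the start of clip_b.

    Clips are arrays shaped (frames, height, width, channels).
    Returns (video, (start, end)), where frames start..end-1 of `video`
    are the blended transition frames. Hypothetical helper, not the
    authors' implementation.
    """
    if duration < 1 or duration > min(len(clip_a), len(clip_b)):
        raise ValueError("duration must fit inside both clips")
    a = clip_a.astype(np.float32)
    b = clip_b.astype(np.float32)
    # Blend weights strictly between 0 and 1, so every transition
    # frame mixes both shots; reshaped to broadcast over H, W, C.
    alpha = np.linspace(0.0, 1.0, duration + 2)[1:-1].reshape(-1, 1, 1, 1)
    blend = (1.0 - alpha) * a[-duration:] + alpha * b[:duration]
    video = np.concatenate([a[:-duration], blend, b[duration:]])
    start = len(clip_a) - duration
    return video, (start, start + duration)


if __name__ == "__main__":
    dark = np.zeros((10, 4, 4, 3), dtype=np.uint8)
    bright = np.full((8, 4, 4, 3), 255, dtype=np.uint8)
    video, (start, end) = synthesize_dissolve(dark, bright, duration=4)
    print(video.shape, start, end)
```

Varying `duration`, or swapping the linear weights for another curve, would give the kind of parameterized variants the abstract mentions, each still carrying an exact boundary label.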
