CV, Feb 15, 2024

What's in the Flow? Exploiting Temporal Motion Cues for Unsupervised Generic Event Boundary Detection

arXiv:2404.18935v1 | 5 citations | h-index: 4 | WACV
Originality: Highly original
AI Analysis

The work addresses the need for efficient event segmentation of videos without large-scale training data, and it reports a significant improvement over existing unsupervised methods.

The paper tackles generic event boundary detection in videos with a non-parametric, unsupervised method based on optical flow. It reports state-of-the-art results among unsupervised methods: an F1@0.05 score of 0.713 on Kinetics-GEBD and an average F1 score of 0.623 on TAPOS.
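For reference, F1@0.05 can be read as an F1 score in which a predicted boundary counts as correct when it lies within 5% of the video length of a not-yet-matched ground-truth boundary. The following is a minimal sketch under that assumption. The function name `f1_at_rel_dis`, the greedy matching strategy, and the example timestamps are illustrative only and are not taken from the paper or from any official evaluation script.

```python
def f1_at_rel_dis(pred, gt, duration, threshold=0.05):
    """F1 score for boundary timestamps (seconds) at a relative-distance threshold.

    Assumption: a prediction is a true positive if it is within
    threshold * duration of a ground-truth boundary that has not
    already been matched (greedy one-to-one matching).
    """
    tolerance = threshold * duration
    unmatched = sorted(gt)
    true_positives = 0
    for p in sorted(pred):
        if not unmatched:
            break
        # Closest ground-truth boundary that is still available.
        nearest = min(unmatched, key=lambda g: abs(g - p))
        if abs(nearest - p) <= tolerance:
            true_positives += 1
            unmatched.remove(nearest)
    precision = true_positives / len(pred) if pred else 0.0
    recall = true_positives / len(gt) if gt else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical 10-second video: two of three predictions fall within 0.5 s
# of a ground-truth boundary.
score = f1_at_rel_dis(pred=[1.0, 4.2, 9.0], gt=[1.2, 4.0, 7.0], duration=10.0)
```

In this toy case, precision and recall are both 2/3, so the F1 score is also 2/3.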

The Generic Event Boundary Detection (GEBD) task aims to recognize generic, taxonomy-free boundaries that segment a video into meaningful events. Current methods typically involve a neural model trained on a large volume of data, demanding substantial computational power and storage space. We explore two pivotal questions pertaining to GEBD: Can non-parametric algorithms outperform unsupervised neural methods? Does motion information alone suffice for high performance? This inquiry drives us to algorithmically harness motion cues for identifying generic event boundaries in videos. In this work, we propose FlowGEBD, a non-parametric, unsupervised technique for GEBD. Our approach entails two algorithms utilizing optical flow: (i) Pixel Tracking and (ii) Flow Normalization. Through thorough experimentation on the challenging Kinetics-GEBD and TAPOS datasets, our results establish FlowGEBD as the new state-of-the-art (SOTA) among unsupervised methods. FlowGEBD exceeds the neural models on the Kinetics-GEBD dataset, obtaining an F1@0.05 score of 0.713 with an absolute gain of 31.7% over the unsupervised baseline, and achieves an average F1 score of 0.623 on the TAPOS validation dataset.
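To make the idea of a training-free, motion-only detector concrete, here is a toy sketch. It is not the Pixel Tracking or Flow Normalization algorithm, neither of which the abstract specifies. It assumes a per-frame mean optical-flow magnitude has already been computed by some flow estimator, and the function name `detect_boundaries` and the `min_jump` threshold are hypothetical.

```python
def detect_boundaries(flow_mag, min_jump=0.5):
    """Return frame indices where normalized motion changes abruptly.

    flow_mag: per-frame mean optical-flow magnitudes (assumed precomputed).
    min_jump: hypothetical threshold on the normalized frame-to-frame change.
    """
    if not flow_mag:
        return []
    lo, hi = min(flow_mag), max(flow_mag)
    span = hi - lo
    if span == 0:
        return []  # constant motion: no change to detect
    # Min-max normalize so the threshold does not depend on the video's scale.
    norm = [(m - lo) / span for m in flow_mag]
    # change[i] is the jump between frame i and frame i + 1.
    change = [abs(b - a) for a, b in zip(norm, norm[1:])]
    boundaries = []
    for i, c in enumerate(change):
        left = change[i - 1] if i > 0 else 0.0
        right = change[i + 1] if i + 1 < len(change) else 0.0
        # Keep local peaks of the change signal that exceed the threshold.
        if c >= min_jump and c >= left and c > right:
            boundaries.append(i + 1)
    return boundaries


# Hypothetical signal: low motion, a burst of high motion, then low motion again.
frames = detect_boundaries([0.1, 0.1, 0.1, 2.0, 2.0, 2.0, 0.2, 0.2])
```

The sketch involves no learned parameters, which is the sense in which such an approach is non-parametric; a real system would still need an optical-flow estimator upstream to produce the input signal.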

