CV · Mar 3, 2022

Revisiting Click-based Interactive Video Object Segmentation

arXiv:2203.01784v2 · 7 citations · h-index: 70
AI Analysis

This work addresses the high user effort that video object segmentation demands of researchers and practitioners, but its contribution is incremental: it adapts existing scribble-based methods to clicks.

The paper simplifies interactive video object segmentation with a click-based framework (CiVOS) that reduces user workload while achieving competitive results on the DAVIS dataset.

While current methods for interactive Video Object Segmentation (iVOS) rely on scribble-based interactions to generate precise object masks, we propose a Click-based interactive Video Object Segmentation (CiVOS) framework to reduce the required user workload as much as possible. CiVOS builds on decoupled modules for user interaction and mask propagation. The interaction module converts click-based interactions into an object mask, which the propagation module then propagates to the remaining frames. Additional user interactions allow the object mask to be refined. The approach is extensively evaluated on the popular interactive DAVIS dataset, with the unavoidable replacement of scribble-based interactions by click-based counterparts. We consider several strategies for generating clicks during our evaluation to reflect various user inputs, and we adjust the DAVIS performance metric to allow a hardware-independent comparison. The presented CiVOS pipeline achieves competitive results while requiring a lower user workload.
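The decoupled design described in the abstract can be illustrated with a minimal Python sketch. Everything in it is a hypothetical stand-in and not the CiVOS implementation: the names `Click`, `interaction_module`, `propagation_module`, and `segment_video` are invented here, the interaction module is replaced by a plain flood fill over equal-intensity pixels, and the propagation module by simple intensity matching. Only the control flow (clicks to mask on one frame, mask to all frames, repeat for each refinement round) follows the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Click:
    frame: int      # index of the frame the user clicked on
    row: int
    col: int
    positive: bool  # True = object, False = background


def interaction_module(image, clicks):
    """Toy stand-in: flood-fill equal-intensity pixels from each click."""
    h, w = len(image), len(image[0])
    mask = [[False] * w for _ in range(h)]
    # Apply positive clicks first so negative clicks can carve regions out.
    for click in sorted(clicks, key=lambda c: not c.positive):
        target = image[click.row][click.col]
        stack, seen = [(click.row, click.col)], set()
        while stack:
            r, c = stack.pop()
            if (r, c) in seen or not (0 <= r < h and 0 <= c < w):
                continue
            if image[r][c] != target:
                continue
            seen.add((r, c))
            mask[r][c] = click.positive
            stack.extend([(r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)])
    return mask


def propagation_module(frames, ref_index, ref_mask):
    """Toy stand-in: mark pixels whose intensity occurs inside the reference mask."""
    object_values = {
        frames[ref_index][r][c]
        for r, row in enumerate(ref_mask)
        for c, inside in enumerate(row)
        if inside
    }
    return [
        ref_mask if i == ref_index
        else [[px in object_values for px in row] for row in image]
        for i, image in enumerate(frames)
    ]


def segment_video(frames, rounds):
    """Run one interaction + propagation pass per round of user clicks."""
    masks = None
    clicks_per_frame = {}
    for clicks in rounds:  # each round: clicks placed on a single frame
        frame = clicks[0].frame
        clicks_per_frame.setdefault(frame, []).extend(clicks)
        ref_mask = interaction_module(frames[frame], clicks_per_frame[frame])
        masks = propagation_module(frames, frame, ref_mask)
    return masks


# Two tiny 2x3 grayscale frames; the object (value 9) moves between them.
frames = [
    [[0, 9, 9], [0, 0, 0]],
    [[0, 0, 9], [0, 0, 9]],
]
masks = segment_video(frames, [[Click(frame=0, row=0, col=1, positive=True)]])
```

Because the two modules only exchange a mask, either one could in principle be swapped for a learned network without touching the loop in `segment_video`; that separation is the point of the sketch.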

Code Implementations: 1 repo