CV, Nov 20, 2021

FlowVOS: Weakly-Supervised Visual Warping for Detail-Preserving and Temporally Consistent Single-Shot Video Object Segmentation

arXiv:2111.10621v1
Originality: Incremental advance
AI Analysis

This work targets detail preservation and temporal consistency in video object segmentation, and represents an incremental improvement over prior methods.

The paper tackles semi-supervised video object segmentation by introducing a foreground-targeted visual warping method that improves detail preservation and temporal consistency. On the DAVIS17 and YouTubeVOS benchmarks, it outperforms state-of-the-art offline methods that do not use extra data.

We consider the task of semi-supervised video object segmentation (VOS). Our approach mitigates shortcomings in previous VOS work by addressing detail preservation and temporal consistency using visual warping. In contrast to prior work that uses full optical flow, we introduce a new foreground-targeted visual warping approach that learns flow fields from VOS data. We train a flow module to capture detailed motion between frames using two weakly-supervised losses. Our object-focused approach of warping previous foreground object masks to their positions in the target frame enables detailed mask refinement with fast runtimes without using extra flow supervision. It can also be integrated directly into state-of-the-art segmentation networks. On the DAVIS17 and YouTubeVOS benchmarks, we outperform state-of-the-art offline methods that do not use extra data, as well as many online methods that use extra data. Qualitatively, we also show our approach produces segmentations with high detail and temporal consistency.
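The core operation described above, warping a previous foreground mask to its position in the target frame, can be illustrated with a generic backward warp. The following is a minimal sketch, not the paper's flow module: the function name `warp_mask`, the `(dx, dy)` flow layout, the bilinear sampling, and the border clamping are all assumptions made for illustration, and the flow field here is hand-written rather than learned.

```python
import numpy as np


def warp_mask(prev_mask, flow):
    """Backward-warp a soft mask of shape (H, W) into the target frame.

    Assumed convention (hypothetical, for illustration only):
    flow[y, x] = (dx, dy) is the offset from target pixel (x, y) to the
    location in the previous frame that it should be sampled from.
    """
    h, w = prev_mask.shape
    ys, xs = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")

    # Source coordinates in the previous frame, clamped to the image border.
    src_x = np.clip(xs + flow[..., 0], 0, w - 1)
    src_y = np.clip(ys + flow[..., 1], 0, h - 1)

    # Integer corners surrounding each source coordinate.
    x0 = np.floor(src_x).astype(int)
    y0 = np.floor(src_y).astype(int)
    x1 = np.minimum(x0 + 1, w - 1)
    y1 = np.minimum(y0 + 1, h - 1)

    # Bilinear interpolation weights.
    wx = src_x - x0
    wy = src_y - y0

    top = (1 - wx) * prev_mask[y0, x0] + wx * prev_mask[y0, x1]
    bottom = (1 - wx) * prev_mask[y1, x0] + wx * prev_mask[y1, x1]
    return (1 - wy) * top + wy * bottom


# Toy example: a 2x2 object that moves two pixels to the right.
prev_mask = np.zeros((6, 6), dtype=float)
prev_mask[1:3, 1:3] = 1.0

# Each target pixel samples from two pixels to its left in the previous frame.
flow = np.zeros((6, 6, 2), dtype=float)
flow[..., 0] = -2.0

warped = warp_mask(prev_mask, flow)
```

In this toy case the warped mask covers rows 1 to 2 and columns 3 to 4 of the target frame. A learned, sub-pixel flow field would be sampled the same way, which is why the interpolation is bilinear rather than nearest-neighbour.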

