CV | Jul 16, 2020

Interactive Video Object Segmentation Using Global and Local Transfer Modules

arXiv:2007.08139v1 | 43 citations | Has Code
Originality: Incremental advance
AI Analysis

This work addresses the problem of reducing user effort in video object segmentation for applications like video editing, but it is incremental as it builds on existing interactive methods.

The paper tackles interactive video object segmentation by proposing a deep neural network with annotation and transfer modules, and reports that it outperforms conventional state-of-the-art algorithms.

An interactive video object segmentation algorithm, which takes scribble annotations on query objects as input, is proposed in this paper. We develop a deep neural network, which consists of the annotation network (A-Net) and the transfer network (T-Net). First, given user scribbles on a frame, A-Net yields a segmentation result based on the encoder-decoder architecture. Second, T-Net transfers the segmentation result bidirectionally to the other frames, by employing the global and local transfer modules. The global transfer module conveys the segmentation information in an annotated frame to a target frame, while the local transfer module propagates the segmentation information in a temporally adjacent frame to the target frame. By applying A-Net and T-Net alternately, a user can obtain desired segmentation results with minimal effort. We train the entire network in two stages, by emulating user scribbles and employing an auxiliary loss. Experimental results demonstrate that the proposed interactive video object segmentation algorithm outperforms the state-of-the-art conventional algorithms. Code and models are available at https://github.com/yuk6heo/IVOS-ATNet.
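The alternation of A-Net and T-Net described in the abstract can be pictured as one interaction round. The following is only a rough sketch of that control flow, not the authors' implementation: the function name `interaction_round`, the callable signatures for `a_net` and `t_net`, and the choice to propagate all the way to both ends of the video are assumptions made for illustration.

```python
from typing import Callable, Dict, Optional

# Hypothetical signatures (not from the paper's code); a "mask" is any object.
# a_net(frame_index, scribbles, previous_mask) -> mask
ANet = Callable[[int, object, Optional[object]], object]
# t_net(target_index, annotated_index, annotated_mask, adjacent_mask) -> mask
TNet = Callable[[int, int, object, object], object]


def interaction_round(num_frames: int, annotated: int, scribbles: object,
                      a_net: ANet, t_net: TNet,
                      masks: Optional[Dict[int, object]] = None) -> Dict[int, object]:
    """Run one round: segment the annotated frame, then transfer both ways."""
    masks = dict(masks or {})

    # Step 1: A-Net segments the frame the user scribbled on.
    masks[annotated] = a_net(annotated, scribbles, masks.get(annotated))

    # Step 2a: forward transfer. Each target frame receives the annotated
    # frame's mask (global cue) and the previous frame's mask (local cue).
    for t in range(annotated + 1, num_frames):
        masks[t] = t_net(t, annotated, masks[annotated], masks[t - 1])

    # Step 2b: backward transfer, using the following frame as the local cue.
    for t in range(annotated - 1, -1, -1):
        masks[t] = t_net(t, annotated, masks[annotated], masks[t + 1])

    return masks
```

In a further round, the user would scribble on another frame and the returned `masks` would be passed back in, so that A-Net can refine the existing result for that frame before the transfer is repeated.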

Code Implementations (4 repos)
