CV | Jan 19, 2020

Zero-Shot Video Object Segmentation via Attentive Graph Neural Networks

Wenguan Wang, Xiankai Lu, Jianbing Shen, David Crandall, Ling Shao

arXiv:2001.06807v127.5294 citations | Has Code

Originality: Incremental advance

AI Analysis

This work addresses video object segmentation without manual annotation at test time, along with image object co-segmentation. Its attentive graph neural network improves segmentation accuracy on both tasks, though the approach is an incremental advance rather than a new direction.

The authors tackle zero-shot video object segmentation with an attentive graph neural network (AGNN) that models frames as nodes and the relations between frame pairs as attention-based edges. The model achieves new state-of-the-art results on three video segmentation datasets and, when extended to image object co-segmentation, outperforms prior methods on two datasets.

This work proposes a novel attentive graph neural network (AGNN) for zero-shot video object segmentation (ZVOS). The suggested AGNN recasts this task as a process of iterative information fusion over video graphs. Specifically, AGNN builds a fully connected graph to efficiently represent frames as nodes, and relations between arbitrary frame pairs as edges. The underlying pair-wise relations are described by a differentiable attention mechanism. Through parametric message passing, AGNN is able to efficiently capture and mine much richer and higher-order relations between video frames, thus enabling a more complete understanding of video content and more accurate foreground estimation. Experimental results on three video segmentation datasets show that AGNN sets a new state-of-the-art in each case. To further demonstrate the generalizability of our framework, we extend AGNN to an additional task: image object co-segmentation (IOCS). We perform experiments on two famous IOCS datasets and observe again the superiority of our AGNN model. The extensive experiments verify that AGNN is able to learn the underlying semantic/appearance relationships among video frames or related images, and discover the common objects.
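The message-passing idea in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: it assumes each frame is reduced to a single feature vector (rather than a feature map), uses a bilinear form followed by a softmax as the attention on edges, and uses a plain tanh update as a stand-in for whatever learned update function the real model applies. The names `message_passing`, `w_att`, and `w_upd` are hypothetical, and self-loops are simply masked out here.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def message_passing(h, w_att, w_upd, steps=3):
    """Run a few rounds of attention-weighted message passing.

    h     : (n_frames, dim) node states, one row per frame (needs n_frames >= 2)
    w_att : (dim, dim) weights of the bilinear attention on edges (assumed form)
    w_upd : (2 * dim, dim) weights of the simplified node update (assumed form)
    Returns the updated node states and the last attention matrix.
    """
    attn = None
    for _ in range(steps):
        # Fully connected graph: one attention score per ordered frame pair.
        scores = h @ w_att @ h.T
        # Mask self-pairs so each node only gathers from the other frames.
        np.fill_diagonal(scores, -np.inf)
        attn = softmax(scores, axis=1)
        # Each node receives an attention-weighted sum of its neighbours.
        messages = attn @ h
        # Simplified update: combine the previous state with the message.
        h = np.tanh(np.concatenate([h, messages], axis=1) @ w_upd)
    return h, attn


# Example usage with random features for four frames.
rng = np.random.default_rng(0)
frames = rng.normal(size=(4, 8))
w_att = rng.normal(size=(8, 8)) * 0.1
w_upd = rng.normal(size=(16, 8)) * 0.1
states, attention = message_passing(frames, w_att, w_upd, steps=3)
```

Repeating the loop lets information from one frame reach another through intermediate frames, which is one way to read the abstract's "higher-order relations"; a segmentation model would then decode a foreground mask from each final node state.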

