CV · Jan 19, 2020

Zero-Shot Video Object Segmentation via Attentive Graph Neural Networks

arXiv:2001.06807v1 · 294 citations
AI Analysis

This work addresses video object segmentation and image object co-segmentation without manual annotations. The proposed method improves segmentation accuracy for computer vision applications, although the contribution is incremental in its approach.

The authors tackle zero-shot video object segmentation with an attentive graph neural network (AGNN) that models video frames as nodes and uses attention to capture the relations between them. The model sets a new state of the art on three video segmentation datasets and, when extended to image object co-segmentation, again shows superior performance.

This work proposes a novel attentive graph neural network (AGNN) for zero-shot video object segmentation (ZVOS). The suggested AGNN recasts this task as a process of iterative information fusion over video graphs. Specifically, AGNN builds a fully connected graph to efficiently represent frames as nodes, and relations between arbitrary frame pairs as edges. The underlying pair-wise relations are described by a differentiable attention mechanism. Through parametric message passing, AGNN is able to efficiently capture and mine much richer and higher-order relations between video frames, thus enabling a more complete understanding of video content and more accurate foreground estimation. Experimental results on three video segmentation datasets show that AGNN sets a new state-of-the-art in each case. To further demonstrate the generalizability of our framework, we extend AGNN to an additional task: image object co-segmentation (IOCS). We perform experiments on two famous IOCS datasets and observe again the superiority of our AGNN model. The extensive experiments verify that AGNN is able to learn the underlying semantic/appearance relationships among video frames or related images, and discover the common objects.
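The mechanism described in the abstract (frames as nodes of a fully connected graph, attention as edge relations, and iterative message passing) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. It assumes each frame is reduced to a single feature vector, uses a bilinear attention score, drops self-edges, and uses a plain `tanh` update; the function names and the weight matrices `W_att`, `W_self`, and `W_msg` are hypothetical and chosen only for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax; entries set to -inf receive weight 0.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def pairwise_attention(h, W_att):
    """Attention weights over a fully connected frame graph.

    h:     (N, D) array, one feature vector per frame (node).
    W_att: (D, D) bilinear weight matrix (illustrative assumption).
    Returns an (N, N) matrix whose row i holds the weights node i
    assigns to every other node. Requires N >= 2.
    """
    scores = h @ W_att @ h.T            # score for every frame pair
    np.fill_diagonal(scores, -np.inf)   # simplification: no self-edges
    return softmax(scores, axis=1)


def message_passing(h, W_att, W_self, W_msg, steps=3):
    """Run `steps` rounds of attention-weighted message passing."""
    for _ in range(steps):
        attn = pairwise_attention(h, W_att)   # edge relations
        messages = attn @ h                   # aggregate neighbour states
        # Simplified node update (an assumption, not the paper's update).
        h = np.tanh(h @ W_self + messages @ W_msg)
    return h


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n_frames, dim = 5, 8
    h0 = rng.normal(size=(n_frames, dim))
    W_att, W_self, W_msg = (0.1 * rng.normal(size=(dim, dim)) for _ in range(3))
    h_final = message_passing(h0, W_att, W_self, W_msg, steps=3)
    print(h_final.shape)
```

After several rounds, each node state mixes information from every other frame, which is the sense in which repeated message passing can capture higher-order relations than a single pairwise comparison. In a segmentation setting the nodes would carry spatial feature maps and a readout head would produce a mask per frame; both are omitted here.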

Code Implementations: 1 repo
