CV · Dec 7, 2020

Learning Video Instance Segmentation with Recurrent Graph Neural Networks

arXiv:2012.03911v1 · 8 citations
AI Analysis

This work tackles the challenging problem of video instance segmentation for real-time applications by formulating a purely learning-based method.

This paper proposes a purely learning-based method for video instance segmentation, modeling both temporal aspects and track management jointly. The approach, operating at over 25 FPS, outperforms previous real-time methods.

Most existing approaches to video instance segmentation comprise multiple modules that are heuristically combined to produce the final output. Formulating a purely learning-based method instead, one that models both the temporal aspect and the generic track management required to solve the video instance segmentation task, is a highly challenging problem. In this work, we propose a novel learning formulation in which the entire video instance segmentation problem is modelled jointly. We fit a flexible model to our formulation that, with the help of a graph neural network, processes all available new information in each frame. Past information is considered and processed via a recurrent connection. We demonstrate the effectiveness of the proposed approach in comprehensive experiments. Our approach, operating at over 25 FPS, outperforms previous real-time video instance segmentation methods. We further conduct detailed ablative experiments that validate the different aspects of our approach.
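The abstract describes two ingredients: a graph neural network that processes the new information in each frame, and a recurrent connection that carries past information forward. The toy sketch below illustrates only that general pattern and is not the authors' architecture. Everything in it is an assumption made for illustration: the names `step` and `run`, the dot-product edge scores, and the fixed `gate` (a trained model would learn its edge and update functions). Track initialization and removal, which the paper also learns, are left out entirely.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def step(track_states, detections, gate=0.5):
    """Process one frame (hypothetical helper, not from the paper).

    Each track is a node holding a hidden state; each detection is a node
    holding a feature vector. Every track aggregates detection features over
    soft edges (softmax of dot-product scores), then blends the aggregated
    message into its previous state. The blend is the recurrent connection.
    """
    if not detections:
        # No new information in this frame: states carry over unchanged.
        return [list(h) for h in track_states]
    new_states = []
    for h in track_states:
        weights = softmax([dot(h, d) for d in detections])
        message = [
            sum(w * d[k] for w, d in zip(weights, detections))
            for k in range(len(h))
        ]
        new_states.append(
            [(1 - gate) * hk + gate * mk for hk, mk in zip(h, message)]
        )
    return new_states


def run(initial_states, frames, gate=0.5):
    """Apply `step` frame by frame, feeding each output back in as input."""
    states = [list(h) for h in initial_states]
    for detections in frames:
        states = step(states, detections, gate)
    return states
```

Calling `run(initial_states, frames)` with `frames` as a list of per-frame detection lists returns one updated state per track. Because each state depends on all earlier frames only through the previous state, the per-frame cost stays constant, which is the property that makes a recurrent design attractive for real-time use.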

