CV · Nov 16, 2016

One-Shot Video Object Segmentation

arXiv:1611.05198v4 · 981 citations
Originality: Highly original
AI Analysis

The paper addresses efficient, accurate object segmentation in videos for computer vision applications. It represents a strong, specific gain rather than a foundational advance.

This paper addresses semi-supervised video object segmentation with OSVOS, a method that transfers generic semantic information learned on ImageNet first to foreground segmentation and then to the appearance of a single annotated object, using only the first-frame mask. It improves the state of the art from 68.0% to 79.8%, an 11.8-point gain.

This paper tackles the task of semi-supervised video object segmentation, i.e., the separation of an object from the background in a video, given the mask of the first frame. We present One-Shot Video Object Segmentation (OSVOS), based on a fully-convolutional neural network architecture that is able to successively transfer generic semantic information, learned on ImageNet, to the task of foreground segmentation, and finally to learning the appearance of a single annotated object of the test sequence (hence one-shot). Although all frames are processed independently, the results are temporally coherent and stable. We perform experiments on two annotated video segmentation databases, which show that OSVOS is fast and improves the state of the art by a significant margin (79.8% vs 68.0%).
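
The one-shot idea in the abstract (start from previously learned weights, fine-tune on the single annotated first frame, then segment every frame independently) can be illustrated with a toy sketch. This is not the OSVOS network: a per-pixel logistic classifier stands in for the fully-convolutional architecture, zero-initialised weights stand in for the pretrained parent model, and the names `fine_tune`, `segment`, and `make_frame`, the synthetic 4x4 video, and the hyperparameters are all assumptions made for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def score(weights, pixel):
    """Linear score of one pixel; the last weight is the bias."""
    *w, bias = weights
    return sum(wi * xi for wi, xi in zip(w, pixel)) + bias


def fine_tune(parent_weights, frame, mask, steps=300, lr=0.5):
    """Adapt the parent weights to one annotated frame (the 'one shot')."""
    weights = list(parent_weights)
    samples = [(px, m) for row, mrow in zip(frame, mask)
               for px, m in zip(row, mrow)]
    for _ in range(steps):
        grad = [0.0] * len(weights)
        for px, m in samples:
            err = sigmoid(score(weights, px)) - m  # logistic-loss gradient
            for i, xi in enumerate((*px, 1.0)):    # features plus bias term
                grad[i] += err * xi
        weights = [w - lr * g / len(samples) for w, g in zip(weights, grad)]
    return weights


def segment(weights, frame):
    """Segment a single frame; no information from other frames is used."""
    return [[1 if score(weights, px) >= 0.0 else 0 for px in row]
            for row in frame]


def make_frame(offset, size=4):
    """Hypothetical toy frame: a 2x2 object whose left edge is at `offset`."""
    fg, bg = (0.9, 0.1), (0.1, 0.8)  # made-up two-channel pixel values
    mask = [[1 if 1 <= r <= 2 and offset <= c < offset + 2 else 0
             for c in range(size)] for r in range(size)]
    frame = [[fg if m else bg for m in row] for row in mask]
    return frame, mask


# The object moves one column per frame; only frame 0 is annotated.
video = [make_frame(t) for t in range(3)]
first_frame, first_mask = video[0]
parent = [0.0, 0.0, 0.0]  # stand-in for pretrained weights (assumption)
weights = fine_tune(parent, first_frame, first_mask)
predicted = [segment(weights, frame) for frame, _ in video]
```

Because `segment` looks at one frame at a time, any consistency across frames in this sketch comes only from the appearance model learned on the first frame, which mirrors the abstract's remark that frames are processed independently.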

Code Implementations: 8 repos

Data from Papers with Code (CC-BY-SA-4.0)

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
