CV, Jul 2, 2019

Proposal, Tracking and Segmentation (PTS): A Cascaded Network for Video Object Segmentation

arXiv:1907.01203v2, 2 citations, Has Code
Originality: Incremental advance
AI Analysis

This work addresses pixel-level object tracking in videos for computer vision applications. It is an incremental improvement built around a unified framework.

The authors tackled video object segmentation by proposing a cascaded network with object proposal, tracking, and segmentation components, achieving state-of-the-art performance on benchmarks like DAVIS'17 and YouTube-VOS.

Video object segmentation (VOS) aims at pixel-level object tracking given only the annotations in the first frame. Due to the large visual variations of objects in video and the lack of training samples, it remains a difficult task despite the rapid development of deep learning. To address the VOS problem, we bring in several new insights through a proposed unified framework consisting of object proposal, tracking, and segmentation components. The object proposal network transfers objectness information as generic knowledge into VOS; the tracking network identifies the target object from the proposals; and the segmentation network operates on the tracking results with a novel dynamic-reference-based model adaptation scheme. Extensive experiments have been conducted on the DAVIS'17 and YouTube-VOS datasets, and our method achieves state-of-the-art performance on several video object segmentation benchmarks. We make the code publicly available at https://github.com/sydney0zq/PTSNet.
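The data flow of the three-stage cascade described in the abstract can be illustrated with a small toy sketch. It is not the authors' implementation: the proposal stage is assumed to be precomputed boxes per frame, the tracker is replaced by a simple highest-IoU match against the previous box, and the segmenter just fills the tracked box. All names here (`iou`, `track`, `segment`, `run_cascade`) are hypothetical and are not taken from the PTSNet repository.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (x0, y0, x1, y1), upper bounds exclusive
Mask = List[List[int]]           # rows of 0/1 pixel labels


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    inter_w = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def track(proposals: List[Box], previous: Box) -> Box:
    """Stand-in tracker (assumption): pick the proposal that overlaps the
    previous target box the most; keep the previous box if none exist."""
    if not proposals:
        return previous
    return max(proposals, key=lambda p: iou(p, previous))


def segment(box: Box, height: int, width: int) -> Mask:
    """Stand-in segmenter (assumption): label every pixel inside the box."""
    x0, y0, x1, y1 = box
    return [
        [1 if x0 <= x < x1 and y0 <= y < y1 else 0 for x in range(width)]
        for y in range(height)
    ]


def run_cascade(
    frame_proposals: List[List[Box]], first_box: Box, height: int, width: int
) -> List[Mask]:
    """Run proposal -> tracking -> segmentation, one frame at a time."""
    masks: List[Mask] = []
    current = first_box  # derived from the first-frame annotation
    for proposals in frame_proposals:
        current = track(proposals, current)             # tracking stage
        masks.append(segment(current, height, width))   # segmentation stage
    return masks


if __name__ == "__main__":
    # Two frames, each with two hypothetical proposals.
    proposals = [[(0, 0, 2, 2), (3, 3, 5, 5)], [(1, 0, 3, 2), (4, 4, 6, 6)]]
    for mask in run_cascade(proposals, first_box=(0, 0, 2, 2), height=6, width=6):
        print("\n".join("".join(str(v) for v in row) for row in mask), end="\n\n")
```

The point of the sketch is only the ordering of the stages: proposals constrain the tracker's choices, and the segmenter works only on what the tracker selected. In the paper each stage is a learned network, and the segmentation stage additionally adapts its model using dynamic references, which this toy version does not attempt to model.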
