CV · Nov 20, 2019

D3S -- A Discriminative Single Shot Segmentation Tracker

arXiv:1911.08862v2 · 266 citations
Originality: Highly original
AI Analysis

This work addresses the need for more accurate and efficient object tracking in computer vision, offering a novel approach that combines robustness and segmentation without per-dataset finetuning.

The paper addresses localization accuracy in visual object tracking by proposing D3S, a discriminative single-shot segmentation tracker that bridges tracking and segmentation. D3S achieves top performance on the VOT2016, VOT2018, and GOT-10k benchmarks while running close to real time.

Template-based discriminative trackers are currently the dominant tracking paradigm due to their robustness, but they are restricted to bounding box tracking and a limited range of transformation models, which reduces their localization accuracy. We propose a discriminative single-shot segmentation tracker, D3S, which narrows the gap between visual object tracking and video object segmentation. A single-shot network applies two target models with complementary geometric properties, one invariant to a broad range of transformations, including non-rigid deformations, the other assuming a rigid object, to simultaneously achieve high robustness and online target segmentation. Without per-dataset finetuning and trained only for segmentation as the primary output, D3S outperforms all trackers on the VOT2016, VOT2018 and GOT-10k benchmarks and performs close to the state-of-the-art trackers on TrackingNet. D3S outperforms the leading segmentation tracker SiamMask on a video object segmentation benchmark and performs on par with top video object segmentation algorithms, while running an order of magnitude faster, close to real time.
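The idea of pairing two target models with complementary geometric properties can be illustrated with a toy NumPy sketch. This is not the D3S implementation: the function names, the cosine-similarity posterior, the dense template correlation, and the peak-window gating used to combine the two outputs are all assumptions made for illustration. The abstract only states that a single-shot network applies both models.

```python
import numpy as np


def invariant_posterior(features, fg_samples, bg_samples):
    """Per-pixel foreground probability from feature similarity alone.

    No spatial layout is used, so the result is unaffected by target
    deformation. features: (H, W, C); samples: (N, C). Hypothetical helper.
    """
    h, w, c = features.shape
    flat = features.reshape(-1, c)
    flat = flat / (np.linalg.norm(flat, axis=1, keepdims=True) + 1e-8)

    def best_similarity(samples):
        unit = samples / (np.linalg.norm(samples, axis=1, keepdims=True) + 1e-8)
        return (flat @ unit.T).max(axis=1)  # best cosine match per pixel

    fg = np.exp(best_similarity(fg_samples))
    bg = np.exp(best_similarity(bg_samples))
    return (fg / (fg + bg)).reshape(h, w)  # two-way softmax


def rigid_response(features, template):
    """Dense correlation with a fixed template (assumes a rigid target).

    template: (kh, kw, C) with odd kh and kw. Hypothetical helper.
    """
    kh, kw, _ = template.shape
    padded = np.pad(features, ((kh // 2, kh // 2), (kw // 2, kw // 2), (0, 0)))
    # windows has shape (H, W, C, kh, kw)
    windows = np.lib.stride_tricks.sliding_window_view(
        padded, (kh, kw), axis=(0, 1)
    )
    return np.einsum("hwckl,klc->hw", windows, template)


def combine(posterior, response, radius):
    """Keep likely-foreground pixels near the rigid model's peak.

    The gating rule is an assumption of this sketch, not the paper's method.
    """
    peak = np.unravel_index(np.argmax(response), response.shape)
    ys, xs = np.indices(posterior.shape)
    near_peak = (np.abs(ys - peak[0]) <= radius) & (np.abs(xs - peak[1]) <= radius)
    return (posterior > 0.5) & near_peak, peak


# Toy scene: background feature [0, 1], a 3x3 target with feature [1, 0],
# and one distractor pixel that looks exactly like the target.
features = np.zeros((8, 8, 2))
features[..., 1] = 1.0
features[2:5, 3:6] = [1.0, 0.0]
features[7, 0] = [1.0, 0.0]

posterior = invariant_posterior(
    features, np.array([[1.0, 0.0]]), np.array([[0.0, 1.0]])
)
template = np.zeros((3, 3, 2))
template[..., 0] = 1.0
response = rigid_response(features, template)
mask, peak = combine(posterior, response, radius=2)
```

In this toy scene the similarity model alone also marks the distractor pixel as foreground, while the template response peaks on the true target, so the combined mask keeps only the 3x3 target region. That is the complementary behavior the abstract describes: one model supplies pixel-level detail, the other supplies robust localization.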

Code Implementations: 1 repo
