CVRONov 6, 2020

Online Descriptor Enhancement via Self-Labelling Triplets for Visual Data Association

arXiv:2011.10471v2
AI Analysis

This addresses the challenge of improving data association for robotic applications like tracking, offering incremental gains by adapting descriptors during inference without needing large domain-specific datasets.

The paper tackles the problem of object-level visual data association in robotics by proposing a self-supervised method to incrementally refine visual descriptors online, which reduces parameters by 94% and surpasses other methods in tracking-by-detection tasks.

Object-level data association is central to robotic applications such as tracking-by-detection and object-level simultaneous localization and mapping. While current learned visual data association methods outperform hand-crafted algorithms, many rely on large collections of domain-specific training examples that can be difficult to obtain without prior knowledge. Additionally, such methods often remain fixed during inference-time and do not harness observed information to better their performance. We propose a self-supervised method for incrementally refining visual descriptors to improve performance in the task of object-level visual data association. Our method optimizes deep descriptor generators online, by continuously training a widely available image classification network pre-trained with domain-independent data. We show that earlier layers in the network outperform later-stage layers for the data association task while also allowing for a 94% reduction in the number of parameters, enabling the online optimization. We show that self-labelling challenging triplets--choosing positive examples separated by large temporal distances and negative examples close in the descriptor space--improves the quality of the learned descriptors for the multi-object tracking task. Finally, we demonstrate that our approach surpasses other visual data-association methods applied to a tracking-by-detection task, and show that it provides better performance-gains when compared to other methods that attempt to adapt to observed information.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes