CV, AI, LG | Nov 26, 2017

MAVOT: Memory-Augmented Video Object Tracking

arXiv:1711.09414v1 | 7 citations
Originality: Incremental advance
AI Analysis

This paper addresses robust long-term object tracking in video. The advance is incremental, since it builds on existing memory-augmented methods.

The paper tackles video object tracking with a one-shot learning approach: an external memory stores and retrieves features of the object and its background, which helps the tracker handle occlusions and appearance variations. On the VOT-2016 benchmark it ranks among the top 5 performers in accuracy and robustness under occlusion, motion changes, and empty target.

We introduce a one-shot learning approach for video object tracking. The proposed algorithm requires seeing the object to be tracked only once, and employs an external memory to store and remember the evolving features of the foreground object as well as backgrounds over time during tracking. With the relevant memory retrieved and updated at each tracking step, our tracking model is capable of maintaining long-term memory of the object, and thus can naturally deal with hard tracking scenarios including partial and total occlusion, motion changes, and large scale and shape variations. In our experiments we use the ImageNet ILSVRC2015 video detection dataset for training and the VOT-2016 benchmark for testing and comparing our Memory-Augmented Video Object Tracking (MAVOT) model. From the results, we conclude that given its one-shot property and simplicity in design, MAVOT is an attractive approach in visual tracking because it shows good performance on the VOT-2016 benchmark and is among the top 5 performers in accuracy and robustness in occlusion, motion changes and empty target.
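The store-and-retrieve idea in the abstract can be illustrated with a toy example. The sketch below is not the MAVOT architecture: the `FeatureMemory` class, the cosine-similarity lookup, the least-recently-used overwrite rule, and the three-element feature vectors are all assumptions made for illustration. The abstract does not specify how the memory is addressed or updated.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


class FeatureMemory:
    """Hypothetical fixed-size feature memory (illustration only, not MAVOT)."""

    def __init__(self, slots):
        self.slots = slots
        self.entries = []  # each entry: [feature, label, last_used_step]
        self.step = 0

    def read(self, query):
        """Return (label, similarity) of the closest stored feature, or (None, 0.0)."""
        self.step += 1
        if not self.entries:
            return None, 0.0
        best = max(self.entries, key=lambda e: cosine(e[0], query))
        best[2] = self.step  # mark the retrieved slot as recently used
        return best[1], cosine(best[0], query)

    def write(self, feature, label):
        """Store a feature; when full, overwrite the least recently used slot."""
        self.step += 1
        entry = [list(feature), label, self.step]
        if len(self.entries) < self.slots:
            self.entries.append(entry)
        else:
            oldest = min(range(len(self.entries)),
                         key=lambda i: self.entries[i][2])
            self.entries[oldest] = entry


memory = FeatureMemory(slots=3)
memory.write([1.0, 0.0, 0.0], "object")      # appearance from the single annotated frame
memory.write([0.0, 1.0, 0.0], "background")
label, score = memory.read([0.9, 0.1, 0.0])  # slightly changed appearance in a later frame
```

In this toy setup the later-frame query still retrieves the "object" slot because its stored feature remains the closest match. Writing the new appearance back into memory is what lets such a store follow gradual changes in the target.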

