CV AI LG

Nov 26, 2017

MAVOT: Memory-Augmented Video Object Tracking

Boyu Liu, Yanzhao Wang, Yu-Wing Tai, Chi-Keung Tang

arXiv:1711.09414v1

5.07 citations

Originality: Incremental advance

AI Analysis

The paper addresses robust long-term object tracking in video, a core computer vision problem. The advance is incremental because it builds on existing memory-augmented methods.

The paper tackles video object tracking with a one-shot learning approach that stores and retrieves object and background features in an external memory, which helps it handle occlusion and appearance variation. On the VOT-2016 benchmark, it ranks among the top 5 trackers in accuracy and robustness for the occlusion, motion-change, and empty-target categories.

We introduce a one-shot learning approach for video object tracking. The proposed algorithm requires seeing the object to be tracked only once, and employs an external memory to store and remember the evolving features of the foreground object as well as backgrounds over time during tracking. With the relevant memory retrieved and updated at each tracking step, our tracking model is capable of maintaining long-term memory of the object, and thus can naturally deal with hard tracking scenarios including partial and total occlusion, motion changes, and large scale and shape variations. In our experiments we use the ImageNet ILSVRC2015 video detection dataset for training and the VOT-2016 benchmark for testing and comparing our Memory-Augmented Video Object Tracking (MAVOT) model. From the results, we conclude that, given its one-shot property and simplicity in design, MAVOT is an attractive approach in visual tracking: it shows good performance on the VOT-2016 benchmark and is among the top 5 performers in accuracy and robustness in occlusion, motion changes, and empty target.
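The abstract describes an external memory that is read and updated at every tracking step. The toy sketch below illustrates that general idea only; it is not MAVOT's architecture. The class `ExternalMemory`, the cosine-similarity lookup, the blending update, the oldest-slot eviction rule, and the parameters `match_threshold` and `blend` are all assumptions made for illustration, and plain Python lists stand in for learned feature vectors.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class ExternalMemory:
    """Toy fixed-size memory of labelled feature slots (hypothetical design)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.slots = []  # each slot: {"key": list of floats, "label": str, "age": int}

    def read(self, query):
        """Return (label, similarity) of the best-matching slot, or (None, 0.0) if empty."""
        if not self.slots:
            return None, 0.0
        best = max(self.slots, key=lambda s: cosine(s["key"], query))
        return best["label"], cosine(best["key"], query)

    def write(self, feature, label, match_threshold=0.9, blend=0.5):
        """Blend into a close same-label slot; otherwise add a slot, evicting the oldest when full."""
        for slot in self.slots:
            slot["age"] += 1
        candidates = [s for s in self.slots if s["label"] == label]
        if candidates:
            best = max(candidates, key=lambda s: cosine(s["key"], feature))
            if cosine(best["key"], feature) >= match_threshold:
                # Assumed update rule: move the stored key toward the new feature.
                best["key"] = [(1 - blend) * k + blend * f
                               for k, f in zip(best["key"], feature)]
                best["age"] = 0
                return
        if len(self.slots) >= self.capacity:
            # Assumed eviction rule: drop the slot that has gone longest without an update.
            self.slots.remove(max(self.slots, key=lambda s: s["age"]))
        self.slots.append({"key": list(feature), "label": label, "age": 0})


memory = ExternalMemory(capacity=3)
memory.write([1.0, 0.0], "object")        # the single initial view of the target
memory.write([0.0, 1.0], "background")
label, score = memory.read([0.9, 0.1])    # candidate feature from a later frame
```

In this sketch, a candidate that matches a stored "object" slot would be treated as the target and written back, so the stored appearance drifts with the object over time, while dissimilar features occupy their own slots and remain available after an occlusion.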
