CV · May 30, 2017

End-to-end Active Object Tracking via Reinforcement Learning

Wenhan Luo, Peng Sun, Fangwei Zhong, Wei Liu, Tong Zhang, Yizhou Wang

arXiv:1705.10561v3 · 15.297 citations

Originality: Incremental advance

AI Analysis

This work addresses the problem of active object tracking for robotics and vision systems by offering a more efficient and generalizable solution, though it is incremental as it builds on existing reinforcement learning techniques.

The paper tackles active object tracking by proposing an end-to-end deep reinforcement learning approach that predicts camera actions directly from visual frames, removing the need to tune tracking and camera control separately and reducing human labeling effort. The method, trained only in simulators, generalizes well to unseen conditions, and experiments on the VOT dataset suggest that it can potentially transfer to real-world scenarios.

We study active object tracking, where a tracker takes as input the visual observation (i.e., frame sequence) and produces the camera control signal (e.g., move forward, turn left, etc.). Conventional methods tackle tracking and camera control separately, which makes the two components challenging to tune jointly. This separation also incurs considerable human effort for labeling and many expensive rounds of trial and error in the real world. To address these issues, we propose an end-to-end solution via deep reinforcement learning, where a ConvNet-LSTM function approximator is adopted for direct frame-to-action prediction. We further propose an environment augmentation technique and a customized reward function, both of which are crucial for successful training. The tracker trained in simulators (ViZDoom, Unreal Engine) generalizes well to unseen object moving paths, unseen object appearances, unseen backgrounds, and distracting objects. It can restore tracking after occasionally losing the target. With experiments on the VOT dataset, we also find that the tracking ability, obtained solely from simulators, can potentially transfer to real-world scenarios.
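
The abstract mentions a customized reward function without giving its form. As a rough illustration only, the sketch below shows one plausible shape for such a reward: it is highest when the target sits at a desired distance straight ahead of the camera and decreases with position and orientation error. The function name `tracking_reward` and every parameter (`desired_dist`, `max_reward`, `dist_scale`, `angle_weight`) are hypothetical and are not taken from the text above.

```python
import math


def tracking_reward(x, y, angle,
                    desired_dist=1.0,   # assumed ideal distance to the target
                    max_reward=1.0,     # assumed reward at the ideal pose
                    dist_scale=1.0,     # assumed normalizer for position error
                    angle_weight=1.0):  # assumed weight on orientation error
    """Illustrative shaped reward for active tracking (an assumption, not the paper's definition).

    (x, y) is the target position in a camera-centered frame, with y pointing
    forward; angle is the target's bearing offset in radians.
    """
    # Distance between the target and the ideal spot directly ahead of the camera.
    position_error = math.hypot(x, y - desired_dist) / dist_scale
    # Penalty for the target being rotated away from the camera's heading.
    orientation_error = angle_weight * abs(angle)
    return max_reward - (position_error + orientation_error)


if __name__ == "__main__":
    # Target exactly at the ideal pose versus far off to the side.
    print(tracking_reward(0.0, 1.0, 0.0))
    print(tracking_reward(3.0, 5.0, 0.0))
```

A reward of this kind gives the agent a dense signal at every frame, which is one common reason to shape rewards in simulator-based training; the actual definition should be taken from the paper itself.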
