RSPT: Reconstruct Surroundings and Predict Trajectories for Generalizable Active Object Tracking
The paper addresses robust active object tracking for applications such as mobile robots and autonomous driving. The contribution appears incremental: it builds on existing methods, although the hybrid approach that combines them is new.
The paper tackles the challenge of building a generalizable active object tracker that works robustly across different scenarios, especially in unstructured environments with cluttered obstacles and diverse layouts. It presents RSPT, a framework that reconstructs the surroundings and predicts the target trajectory to form a structure-aware motion representation. The authors show that RSPT outperforms existing methods in unseen environments and transfers successfully to real-world settings.
Active Object Tracking (AOT) aims to maintain a specific relation between the tracker and object(s) by autonomously controlling the motion system of a tracker given observations. AOT has wide-ranging applications, such as in mobile robots and autonomous driving. However, building a generalizable active tracker that works robustly across different scenarios remains a challenge, especially in unstructured environments with cluttered obstacles and diverse layouts. We argue that constructing a state representation capable of modeling the geometric structure of the surroundings and the dynamics of the target is crucial for achieving this goal. To address this challenge, we present RSPT, a framework that forms a structure-aware motion representation by Reconstructing the Surroundings and Predicting the target Trajectory. Additionally, we enhance the generalization of the policy network by training it with an asymmetric dueling mechanism. We evaluate RSPT on various simulated scenarios and show that it outperforms existing methods in unseen environments, particularly those with complex obstacles and layouts. We also demonstrate the successful transfer of RSPT to real-world settings. Project Website: https://sites.google.com/view/aot-rspt.