Object Tracking by Reconstruction with View-Specific Discriminative Correlation Filters
This work targets robust object tracking in dynamic 3D environments, such as robotics and surveillance, by improving the handling of rotations and occlusions. The contribution is incremental in that it builds on existing DCF methods.
The paper tackles object tracking in RGB-D videos, where standard 2D methods struggle with appearance changes caused by out-of-plane rotation. It proposes OTR, a long-term tracker that combines online 3D reconstruction with view-specific discriminative correlation filters, and reports state-of-the-art performance on the Princeton RGB-D tracking and STC benchmarks.
Standard RGB-D trackers treat the target as an inherently 2D structure, which makes modelling appearance changes related even to simple out-of-plane rotation highly challenging. We address this limitation by proposing a novel long-term RGB-D tracker, Object Tracking by Reconstruction (OTR). The tracker performs online 3D target reconstruction to facilitate robust learning of a set of view-specific discriminative correlation filters (DCFs). The 3D reconstruction supports two performance-enhancing features: (i) generation of accurate spatial support for constrained DCF learning from its 2D projection and (ii) point-cloud-based estimation of 3D pose change for selection and storage of view-specific DCFs, which are used to robustly localize the target after out-of-view rotation or heavy occlusion. Extensive evaluation of OTR on the challenging Princeton RGB-D tracking and STC benchmarks shows it outperforms the state of the art by a large margin.
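The idea in (ii), storing and selecting view-specific filters according to the estimated 3D pose change, can be illustrated with a small sketch. This is not the authors' implementation: the `ViewBank` class, the `rot_y` helper, and the 30-degree storage threshold are hypothetical names and values chosen for illustration. The sketch also assumes the pose is already available as a rotation matrix, whereas OTR estimates it from the reconstructed point cloud. Filters are represented by placeholder strings rather than learned DCFs.

```python
import math


def rot_y(deg):
    """3x3 rotation about the y-axis (hypothetical helper for the demo)."""
    a = math.radians(deg)
    c, s = math.cos(a), math.sin(a)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]


def rotation_angle_deg(r_a, r_b):
    """Angle of the relative rotation: acos((trace(Ra^T Rb) - 1) / 2)."""
    # trace(Ra^T Rb) equals the element-wise (Frobenius) inner product.
    trace = sum(r_a[k][i] * r_b[k][i] for k in range(3) for i in range(3))
    cos_theta = max(-1.0, min(1.0, (trace - 1.0) / 2.0))  # clamp rounding error
    return math.degrees(math.acos(cos_theta))


class ViewBank:
    """Stores one filter per sufficiently distinct view (illustrative only)."""

    def __init__(self, min_angle_deg=30.0):
        # Assumed threshold; the abstract does not specify a value.
        self.min_angle_deg = min_angle_deg
        self.views = []  # list of (rotation, filter) pairs

    def nearest(self, rotation):
        """Return (filter, angle) of the stored view closest to `rotation`."""
        if not self.views:
            return None, float("inf")
        best = min(self.views, key=lambda v: rotation_angle_deg(v[0], rotation))
        return best[1], rotation_angle_deg(best[0], rotation)

    def maybe_store(self, rotation, dcf):
        """Store `dcf` only if the pose differs enough from every stored view."""
        _, angle = self.nearest(rotation)
        if angle >= self.min_angle_deg:
            self.views.append((rotation, dcf))
            return True
        return False


bank = ViewBank(min_angle_deg=30.0)
stored_front = bank.maybe_store(rot_y(0), "dcf_front")   # first view is always stored
stored_near = bank.maybe_store(rot_y(10), "dcf_near")    # too close to the front view
stored_side = bank.maybe_store(rot_y(45), "dcf_side")    # distinct enough to store
selected, angle = bank.nearest(rot_y(40))                # e.g. on re-detection
```

In this toy run the bank keeps the front and side views, and a query at 40 degrees selects the side-view filter because it is the closest stored pose. In a real tracker the selected filter would then be correlated with the search region to localize the target.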