Is First Person Vision Challenging for Object Tracking?
This work addresses the problem of evaluating object tracking performance in First Person Vision for researchers and developers working on human-object interaction modeling.
This paper systematically analyzes the performance of state-of-the-art visual trackers in First Person Vision (FPV) using the TREK-150 benchmark dataset. The study finds that current tracking algorithms are not yet robust enough for FPV tasks, indicating a need for more research.
Understanding human-object interactions is fundamental in First Person Vision (FPV). Tracking algorithms that follow the objects manipulated by the camera wearer can provide useful cues for modeling such interactions effectively. Despite a few previous attempts to exploit trackers in FPV applications, a methodical analysis of the performance of state-of-the-art visual trackers in this domain is still missing. In this short paper, we summarize the first systematic study of object tracking in FPV. Our work extensively analyses the performance of recent and baseline FPV trackers with respect to different aspects. This is achieved through TREK-150, a novel benchmark dataset composed of 150 densely annotated video sequences. The results suggest that more research effort should be devoted to this problem so that tracking can benefit FPV tasks. The full version of this paper is available at arXiv:2108.13665.