CVApr 25, 2022

Single Object Tracking Research: A Survey

arXiv:2204.11410v112 citationsh-index: 15
Originality Synthesis-oriented
AI Analysis

It provides a comprehensive overview for researchers in computer vision, but is incremental as it synthesizes existing work without new results.

This survey paper reviews the development of visual object tracking algorithms, focusing on correlation filter and Siamese network frameworks, and discusses benchmarks, challenges, and future trends such as multimodal integration.

Visual object tracking is an important task in computer vision, which has many real-world applications, e.g., video surveillance, visual navigation. Visual object tracking also has many challenges, e.g., object occlusion and deformation. To solve above problems and track the target accurately and efficiently, many tracking algorithms have emerged in recent years. This paper presents the rationale and representative works of two most popular tracking frameworks in past ten years, i.e., the corelation filter and Siamese network for object tracking. Then we present some deep learning based tracking methods categorized by different network structures. We also introduce some classical strategies for handling the challenges in tracking problem. Further, this paper detailedly present and compare the benchmarks and challenges for tracking, from which we summarize the development history and development trend of visual tracking. Focusing on the future development of object tracking, which we think would be applied in real-world scenes before some problems to be addressed, such as the problems in long-term tracking, low-power high-speed tracking and attack-robust tracking. In the future, the integration of multimodal data, e.g., the depth image, thermal image with traditional color image, will provide more solutions for visual tracking. Moreover, tracking task will go together with some other tasks, e.g., video object detection and segmentation.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes