CV · May 22, 2022

Recent Advances in Embedding Methods for Multi-Object Tracking: A Survey

arXiv:2205.10766v2 · 24 citations · h-index: 60
Originality: Synthesis-oriented
AI Analysis

The survey addresses the need of researchers and practitioners for organized knowledge of MOT, but its contribution is incremental: it synthesizes existing work rather than introducing new methods.

This survey tackles the lack of systematic analysis of embedding methods in multi-object tracking (MOT) by providing a comprehensive overview from seven perspectives, summarizing datasets and state-of-the-art methods, and discussing future research directions.

Multi-object tracking (MOT) aims to associate target objects across video frames to obtain their complete trajectories. With the advancement of deep neural networks and the growing demand for intelligent video analysis, MOT has attracted significantly more interest in the computer vision community. Embedding methods play an essential role in object location estimation and temporal identity association in MOT. Unlike in other computer vision tasks, such as image classification, object detection, re-identification, and segmentation, embedding methods in MOT vary widely and have never been systematically analyzed and summarized. In this survey, we first present a comprehensive overview and in-depth analysis of embedding methods in MOT from seven perspectives: patch-level embedding, single-frame embedding, cross-frame joint embedding, correlation embedding, sequential embedding, tracklet embedding, and cross-track relational embedding. We then summarize the widely used MOT datasets and analyze the advantages of existing state-of-the-art methods according to their embedding strategies. Finally, we discuss some critical yet under-investigated areas and future research directions.

