CV | LG | ML | Jul 18, 2019

Deep Learning in Video Multi-Object Tracking: A Survey

arXiv:1907.12740v4 | 658 citations
Originality: Synthesis-oriented
AI Analysis

The paper synthesizes existing research for computer vision practitioners, but its contribution is incremental because it does not introduce new methods.

This paper surveys deep learning methods for multiple object tracking in single-camera videos, reviewing their application across four main steps and providing an experimental comparison on MOTChallenge datasets to identify similarities in top-performing approaches.

The problem of Multiple Object Tracking (MOT) consists in following the trajectory of different objects in a sequence, usually a video. In recent years, with the rise of Deep Learning, the algorithms that provide a solution to this problem have benefited from the representational power of deep models. This paper provides a comprehensive survey on works that employ Deep Learning models to solve the task of MOT on single-camera videos. Four main steps in MOT algorithms are identified, and an in-depth review of how Deep Learning was employed in each one of these stages is presented. A complete experimental comparison of the presented works on the three MOTChallenge datasets is also provided, identifying a number of similarities among the top-performing methods and presenting some possible future research directions.

