Learning a Neural Solver for Multiple Object Tracking
This work addresses the challenge of applying learning methods to structured domains in MOT, extending learning beyond feature extraction to data association. The contribution is incremental in that it builds on the classical network flow formulation.
The paper tackles the problem of learning data association in Multiple Object Tracking (MOT) by proposing a fully differentiable framework based on Message Passing Networks that operates directly on graphs, showing significant improvements in MOTA and IDF1 on three benchmarks.
Graphs offer a natural way to formulate Multiple Object Tracking (MOT) within the tracking-by-detection paradigm. However, they also introduce a major challenge for learning methods, as defining a model that can operate on such a \textit{structured domain} is not trivial. As a consequence, most learning-based work has been devoted to learning better features for MOT and then using these with well-established optimization frameworks. In this work, we exploit the classical network flow formulation of MOT to define a fully differentiable framework based on Message Passing Networks (MPNs). By operating directly on the graph domain, our method can reason globally over an entire set of detections and predict final solutions. Hence, we show that learning in MOT does not need to be restricted to feature extraction, but can also be applied to the data association step. We show a significant improvement in both MOTA and IDF1 on three publicly available benchmarks. Our code is available at https://bit.ly/motsolv.
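To make the graph view concrete, the following is a minimal sketch of the network flow setting, not the released implementation. Detections are nodes, edges link detections in different frames, and a binary label per edge marks whether the two detections belong to the same trajectory. The function names, the `max_gap` parameter, and the hand-set labels are assumptions made for illustration; in the proposed framework the labels would come from edge predictions of the MPN.

```python
from itertools import combinations


def build_graph(detections, max_gap=2):
    """Connect detections from different frames (hypothetical helper).

    detections: list of (frame, x, y) tuples.
    Returns directed edges (i, j) with frame_i < frame_j.
    """
    edges = []
    for i, j in combinations(range(len(detections)), 2):
        fi, fj = detections[i][0], detections[j][0]
        # Detections in the same frame cannot be the same object,
        # and max_gap (an assumed parameter) limits temporal distance.
        if fi == fj or abs(fi - fj) > max_gap:
            continue
        edges.append((i, j) if fi < fj else (j, i))
    return edges


def satisfies_flow_constraints(edges, labels):
    """Check that each node has at most one active incoming
    and at most one active outgoing edge."""
    out_flow, in_flow = {}, {}
    for (i, j), active in zip(edges, labels):
        if active:
            out_flow[i] = out_flow.get(i, 0) + 1
            in_flow[j] = in_flow.get(j, 0) + 1
    return all(v <= 1 for v in out_flow.values()) and all(
        v <= 1 for v in in_flow.values()
    )


def extract_tracks(num_nodes, edges, labels):
    """Follow active edges to recover trajectories.

    Assumes the labels already satisfy the flow constraints.
    """
    successor = {i: j for (i, j), active in zip(edges, labels) if active}
    has_predecessor = set(successor.values())
    tracks = []
    for start in range(num_nodes):
        if start in has_predecessor:
            continue  # not the first detection of a trajectory
        track = [start]
        while track[-1] in successor:
            track.append(successor[track[-1]])
        tracks.append(track)
    return tracks


# Two objects observed in two consecutive frames.
detections = [(0, 0.0, 0.0), (0, 5.0, 5.0), (1, 0.1, 0.0), (1, 5.1, 5.0)]
edges = build_graph(detections)
# Hand-set edge labels standing in for learned edge predictions.
labels = [1, 0, 0, 1]
tracks = extract_tracks(len(detections), edges, labels)
print(edges)   # [(0, 2), (0, 3), (1, 2), (1, 3)]
print(tracks)  # [[0, 2], [1, 3]]
```

A labeling such as `[1, 1, 0, 0]` would assign two outgoing active edges to detection 0 and therefore violate the flow constraints, which is the kind of global structure that reasoning over the whole graph is meant to capture.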