NOOUGAT: Towards Unified Online and Offline Multi-Object Tracking
NOOUGAT addresses the flexible temporal requirements of real-world multi-object tracking deployments by bridging the gap between online and offline methods.
The paper addresses the fragmentation between online and offline multi-object tracking by introducing NOOUGAT, a unified tracker that operates with arbitrary temporal horizons. It achieves state-of-the-art performance, with online-mode AssA gains of +2.3 on DanceTrack and +9.2 on SportsMOT.
The long-standing division between \textit{online} and \textit{offline} Multi-Object Tracking (MOT) has led to fragmented solutions that fail to address the flexible temporal requirements of real-world deployment scenarios. Current \textit{online} trackers rely on frame-by-frame hand-crafted association strategies and struggle with long-term occlusions, whereas \textit{offline} approaches can cover larger time gaps, but still rely on heuristic stitching for arbitrarily long sequences. In this paper, we introduce NOOUGAT, the first tracker designed to operate with arbitrary temporal horizons. NOOUGAT leverages a unified Graph Neural Network (GNN) framework that processes non-overlapping subclips, and fuses them through a novel Autoregressive Long-term Tracking (ALT) layer. The subclip size controls the trade-off between latency and temporal context, enabling a wide range of deployment scenarios, from frame-by-frame to batch processing. NOOUGAT achieves state-of-the-art performance across both tracking regimes, improving \textit{online} AssA by +2.3 on DanceTrack, +9.2 on SportsMOT, and +5.0 on MOT20, with even greater gains in \textit{offline} mode.
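As a rough illustration of how the subclip size trades latency against temporal context, the following minimal Python sketch partitions a sequence into non-overlapping subclips. The helper names `split_into_subclips` and `worst_case_delay_frames` are hypothetical and do not come from the paper. The sketch covers only the partitioning step, assumes a subclip can be processed once its last frame has arrived, and does not model the GNN or the ALT layer.

```python
from typing import List


def split_into_subclips(num_frames: int, subclip_size: int) -> List[range]:
    """Partition frame indices 0..num_frames-1 into non-overlapping subclips.

    Hypothetical helper: the last subclip may be shorter than subclip_size.
    """
    if num_frames < 0:
        raise ValueError("num_frames must be >= 0")
    if subclip_size < 1:
        raise ValueError("subclip_size must be >= 1")
    return [
        range(start, min(start + subclip_size, num_frames))
        for start in range(0, num_frames, subclip_size)
    ]


def worst_case_delay_frames(subclip_size: int) -> int:
    """Frames the first frame of a subclip waits before the subclip is complete.

    Assumption: a subclip is processed only after its last frame arrives.
    """
    if subclip_size < 1:
        raise ValueError("subclip_size must be >= 1")
    return subclip_size - 1


if __name__ == "__main__":
    for size in (1, 4, 10):
        clips = split_into_subclips(10, size)
        print(size, [list(c) for c in clips], worst_case_delay_frames(size))
```

Under these assumptions, `subclip_size=1` reduces to frame-by-frame processing with no added delay, while a `subclip_size` equal to the sequence length reduces to a single batch over the whole sequence.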