CVDec 14, 2023

Multi-Scene Generalized Trajectory Global Graph Solver with Composite Nodes for Multiple Object Tracking

arXiv:2312.08951v118 citationsh-index: 49AAAI
Originality Incremental advance
AI Analysis

This addresses storage and accuracy issues in multi-object tracking for video analysis applications, representing an incremental improvement over existing graph-based methods.

The paper tackles the storage and optimization challenges in global multi-object tracking for long videos by proposing CoNo-Link, a low-storage graph-based framework that treats object trajectories as nodes, achieving state-of-the-art performance on multiple datasets.

The global multi-object tracking (MOT) system can consider interaction, occlusion, and other ``visual blur'' scenarios to ensure effective object tracking in long videos. Among them, graph-based tracking-by-detection paradigms achieve surprising performance. However, their fully-connected nature poses storage space requirements that challenge algorithm handling long videos. Currently, commonly used methods are still generated trajectories by building one-forward associations across frames. Such matches produced under the guidance of first-order similarity information may not be optimal from a longer-time perspective. Moreover, they often lack an end-to-end scheme for correcting mismatches. This paper proposes the Composite Node Message Passing Network (CoNo-Link), a multi-scene generalized framework for modeling ultra-long frames information for association. CoNo-Link's solution is a low-storage overhead method for building constrained connected graphs. In addition to the previous method of treating objects as nodes, the network innovatively treats object trajectories as nodes for information interaction, improving the graph neural network's feature representation capability. Specifically, we formulate the graph-building problem as a top-k selection task for some reliable objects or trajectories. Our model can learn better predictions on longer-time scales by adding composite nodes. As a result, our method outperforms the state-of-the-art in several commonly used datasets.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes