CVJul 1, 2024

GMT: Effective Global Framework for Multi-Camera Multi-Target Tracking

arXiv:2407.01007v24 citationsh-index: 19
AI Analysis

It addresses tracking across multiple camera views for surveillance or security applications, offering a novel approach beyond incremental improvements.

The paper tackles the problem of multi-camera multi-target tracking by proposing GMT, a global framework that jointly uses intra-view and inter-view cues, achieving gains of up to 21.3% in CVMA and 17.2% in CVIDF1 on existing datasets.

Multi-Camera Multi-Target (MCMT) tracking aims to locate and associate the same targets across multiple camera views. Existing methods typically adopt a two-stage framework, involving single-camera tracking followed by inter-camera tracking. However, in this paradigm, multi-view information is used only to recover missed matches in the first stage, providing a limited contribution to overall tracking. To address this issue, we propose GMT, a global MCMT tracking framework that jointly exploits intra-view and inter-view cues for tracking. Specifically, instead of assigning trajectories independently for each view, we integrate the same historical targets across different views as global trajectories, thereby reformulating the two-stage tracking as a unified global-level trajectory-target association process. We introduce a Cross-View Feature Consistency Enhancement (CFCE) module to align visual and spatial features across views, providing a consistent feature space for global trajectory modeling. With these aligned features, the Global Trajectory Association (GTA) module associates new detections with existing global trajectories, enabling direct use of multi-view information. Compared to the two-stage framework, GMT achieves significant improvements on existing datasets, with gains of up to 21.3 percent in CVMA and 17.2 percent in CVIDF1. Furthermore, we introduce VisionTrack, a high-quality, large-scale MCMT dataset providing significantly greater diversity than existing datasets. Our code and dataset will be released.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes