CV · AI · Dec 14, 2024

Heterogeneous Graph Transformer for Multiple Tiny Object Tracking in RGB-T Videos

arXiv:2412.10861v1 · 29 citations · h-index: 36 · Has Code · IEEE Transactions on Multimedia
Originality: Incremental advance
AI Analysis

This paper addresses the challenge of tracking tiny objects with weak appearance cues for applications such as remote sensing. The advance is incremental: it builds on existing multi-object tracking methods and adds multi-modality integration.

The paper tackles multiple tiny object tracking in RGB-T videos by integrating complementary information across modalities, achieving higher MOTA and ID-F1 scores than state-of-the-art methods.

Tracking multiple tiny objects is highly challenging due to their weak appearance and limited features. Existing multi-object tracking algorithms generally focus on single-modality scenes and overlook the complementary characteristics of tiny objects captured by multiple remote sensors. To enhance tracking performance by integrating complementary information from multiple sources, we propose a novel framework called HGT-Track (Heterogeneous Graph Transformer based Multi-Tiny-Object Tracking). Specifically, we first employ a Transformer-based encoder to embed images from different modalities. Subsequently, we utilize a Heterogeneous Graph Transformer to aggregate spatial and temporal information from multiple modalities to generate detection and tracking features. Additionally, we introduce a target re-detection module (ReDet) to ensure tracklet continuity by maintaining consistency across different modalities. Furthermore, this paper introduces VT-Tiny-MOT (Visible-Thermal Tiny Multi-Object Tracking), the first benchmark for RGB-T fused multiple tiny object tracking. Extensive experiments on VT-Tiny-MOT demonstrate the effectiveness of our method. Compared to other state-of-the-art methods, our method achieves better performance in terms of MOTA (Multiple-Object Tracking Accuracy) and ID-F1 score. The code and dataset will be made available at https://github.com/xuqingyu26/HGTMT.
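
For readers unfamiliar with the two metrics named in the abstract, the following is a minimal sketch of their standard definitions, not code from the HGTMT repository. MOTA is commonly computed as 1 minus the ratio of total errors (false negatives, false positives, identity switches) to the number of ground-truth objects, summed over all frames; ID-F1 is the F1 score over identity-level matches. The function names and the example counts are hypothetical and chosen only for illustration.

```python
def mota(false_negatives: int, false_positives: int,
         id_switches: int, num_gt: int) -> float:
    """Multiple-Object Tracking Accuracy from counts summed over all frames.

    MOTA = 1 - (FN + FP + IDSW) / GT. The value can be negative when the
    tracker makes more errors than there are ground-truth objects.
    """
    if num_gt <= 0:
        raise ValueError("num_gt must be positive")
    errors = false_negatives + false_positives + id_switches
    return 1.0 - errors / num_gt


def id_f1(idtp: int, idfp: int, idfn: int) -> float:
    """ID-F1 from identity-level true positives, false positives, false negatives.

    IDF1 = 2 * IDTP / (2 * IDTP + IDFP + IDFN).
    """
    denom = 2 * idtp + idfp + idfn
    # With no detections and no ground truth the score is defined here as 0.
    return 2 * idtp / denom if denom else 0.0


if __name__ == "__main__":
    # Hypothetical counts, not results from the paper.
    print(f"MOTA:  {mota(20, 10, 5, 100):.3f}")
    print(f"ID-F1: {id_f1(80, 10, 30):.3f}")
```

MOTA penalizes each identity switch once, so it mostly reflects detection quality, whereas ID-F1 rewards keeping the same identity over a whole trajectory. Reporting both, as the abstract does, covers detection accuracy and association consistency.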

Code Implementations: 1 repo
