CV · May 14, 2025

Using Cross-Domain Detection Loss to Infer Multi-Scale Information for Improved Tiny Head Tracking

Jisu Kim, Alex Mattingly, Eung-Joo Lee, Benjamin S. Riggan

arXiv:2505.22677v1 · 3.6 · h-index: 16 · FG

Originality: Incremental advance

AI Analysis

This work addresses computational inefficiencies in head detection and tracking for crowded scenes, but it is incremental as it builds on existing methods with specific optimizations.

The paper tackles head detection and tracking in crowded scenes with a framework that balances performance and efficiency, and it reports improved MOTA and mAP on the CroHD and CrowdHuman datasets.

Head detection and tracking are essential for downstream tasks, but current methods often require large computational budgets, which increase latency and tie up resources (e.g., processors, memory, and bandwidth). To address this, we propose a framework that enhances tiny head detection and tracking by optimizing the balance between performance and efficiency. Our framework integrates (1) a cross-domain detection loss, (2) a multi-scale module, and (3) a small receptive field detection mechanism. These innovations enhance detection by bridging the gap between large and small detectors, capturing high-frequency details at multiple scales during training, and using filters with small receptive fields to detect tiny heads. Evaluations on the CroHD and CrowdHuman datasets show improved Multiple Object Tracking Accuracy (MOTA) and mean Average Precision (mAP), demonstrating the effectiveness of our approach in crowded scenes.
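For readers unfamiliar with the tracking metric named in the abstract, the sketch below shows the standard CLEAR MOT definition of MOTA, which penalizes false negatives, false positives, and identity switches relative to the number of ground-truth objects. It is a minimal illustration, not the authors' evaluation code; the function name `mota` and the counts in the example are hypothetical and are not results from the paper.

```python
def mota(false_negatives, false_positives, id_switches, num_gt):
    """Multiple Object Tracking Accuracy (standard CLEAR MOT definition).

    MOTA = 1 - (FN + FP + IDSW) / GT, where every count is summed over
    all frames of the sequence. The value can be negative when the
    tracker makes more errors than there are ground-truth objects.
    """
    if num_gt <= 0:
        raise ValueError("num_gt must be positive")
    errors = false_negatives + false_positives + id_switches
    return 1.0 - errors / num_gt


# Hypothetical counts, for illustration only (not taken from the paper).
example_score = mota(
    false_negatives=120,
    false_positives=60,
    id_switches=20,
    num_gt=1000,
)
print(f"MOTA = {example_score:.3f}")
```

Because all three error types share one denominator, a detector that misses many tiny heads lowers MOTA through the false-negative term even if its identity association is otherwise stable.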

