CV, Aug 30, 2024

Hybrid Classification-Regression Adaptive Loss for Dense Object Detection

Yanquan Huang, Liu Wei Zhen, Yun Hao, Mengyuan Zhang, Qingyao Wu, Zikun Deng, Xueming Liu, Hong Deng

arXiv:2408.17182v12.02 citations, h-index: 5

Originality: Incremental advance

AI Analysis

This work addresses a specific bottleneck in object detection models, offering an incremental improvement for computer vision applications.

The paper tackles inconsistency between the classification and regression tasks in dense object detection. It proposes a Hybrid Classification-Regression Adaptive Loss (HCRAL), which combines a module for cross-task supervision with a factor that focuses training on difficult samples, and reports improved performance on COCO test-dev.

For object detectors, enhancing model performance hinges on the ability to simultaneously account for inconsistencies across tasks and focus on difficult-to-train samples. Achieving this requires incorporating information from both the classification and regression tasks. However, prior work tends either to emphasize difficult-to-train samples within their respective tasks or simply to compute classification scores with IoU, often leading to suboptimal model performance. In this paper, we propose a Hybrid Classification-Regression Adaptive Loss, termed HCRAL. Specifically, we introduce the Residual of Classification and IoU (RCI) module for cross-task supervision, which addresses task inconsistencies, and the Conditioning Factor (CF), which focuses on difficult-to-train samples within each task. Furthermore, we introduce a new strategy named Expanded Adaptive Training Sample Selection (EATSS) to provide additional samples that exhibit classification and regression inconsistencies. To validate the effectiveness of the proposed method, we conduct extensive experiments on COCO test-dev. Experimental evaluations demonstrate the superiority of our approach. Additionally, we design experiments that separately combine the classification and regression losses with regular loss functions in popular one-stage models, demonstrating improved performance.
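
The abstract does not give the RCI formula, so the following is only a minimal sketch of the general idea of a residual between a classification score and IoU, not the paper's formulation. The function names `iou` and `cls_iou_residual`, the `(x1, y1, x2, y2)` box layout, and the use of an absolute difference are all assumptions made for illustration.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    # Corners of the overlap rectangle (may be empty).
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def cls_iou_residual(cls_score, pred_box, gt_box):
    """Hypothetical inconsistency measure: |classification score - IoU|.

    A large value means the classifier's confidence and the localization
    quality disagree for this sample. This is an illustrative assumption,
    not the RCI module defined in the paper.
    """
    return abs(cls_score - iou(pred_box, gt_box))


if __name__ == "__main__":
    gt = (0.0, 0.0, 2.0, 2.0)
    pred = (1.0, 1.0, 3.0, 3.0)
    # High confidence but poor localization gives a large residual.
    print(cls_iou_residual(0.9, pred, gt))
```

Under these assumptions, a sample with a confident score but a poorly localized box yields a large residual, which is the kind of classification and regression inconsistency the abstract says the loss is meant to address.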
