CV · Apr 12, 2022

Localization Distillation for Object Detection

Zhaohui Zheng, Rongguang Ye, Qibin Hou, Dongwei Ren, Ping Wang, Wangmeng Zuo, Ming-Ming Cheng

arXiv:2204.05957v2 · 16.788 citations · h-index: 103 · Has Code

Originality Highly original

AI Analysis

This work addresses a critical bottleneck in object detection by enhancing knowledge distillation for better localization. The contribution is incremental but impactful for improving detector accuracy and efficiency.

The paper tackles the inefficiency of logit mimicking in knowledge distillation for object detection by proposing a localization distillation method and the concept of a valuable localization region. It reports significant AP improvements on the MS COCO, PASCAL VOC, and DOTA benchmarks without sacrificing inference speed.

Previous knowledge distillation (KD) methods for object detection mostly focus on feature imitation rather than mimicking the prediction logits, because logit mimicking is inefficient at distilling localization information. In this paper, we investigate whether logit mimicking always lags behind feature imitation. Towards this goal, we first present a novel localization distillation (LD) method that efficiently transfers localization knowledge from the teacher to the student. Second, we introduce the concept of a valuable localization region, which helps to selectively distill the classification and localization knowledge for a certain region. Combining these two new components, we show for the first time that logit mimicking can outperform feature imitation, and that the absence of localization distillation is a critical reason why logit mimicking has underperformed for years. Thorough studies exhibit the great potential of logit mimicking, which can significantly alleviate localization ambiguity, learn robust feature representations, and ease the training difficulty in the early stage. We also provide a theoretical connection between the proposed LD and classification KD, showing that they share an equivalent optimization effect. Our distillation scheme is simple as well as effective and can be easily applied to both dense horizontal object detectors and rotated object detectors. Extensive experiments on the MS COCO, PASCAL VOC, and DOTA benchmarks demonstrate that our method achieves considerable AP improvement without any sacrifice in inference speed. Our source code and pretrained models are publicly available at https://github.com/HikariTJU/LD.
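The abstract does not spell out the form of the LD loss. As a rough illustration of logit mimicking applied to localization, the sketch below assumes that each bounding-box edge is predicted as a vector of logits over discrete bins, and that the student is trained to match the teacher's temperature-softened distribution through a KL divergence, in the same way classification KD treats class logits. The function names, the four-edge layout, and the default temperature are hypothetical choices made for this example, not taken from the authors' code.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-softened softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def localization_distillation_loss(student_edges, teacher_edges, temperature=10.0):
    """Mean KL between softened teacher and student edge distributions.

    Each argument is a list of logit vectors, one per box edge.
    Assumed (hypothetical) layout: left, top, right, bottom.
    The default temperature is an arbitrary illustrative value.
    """
    losses = []
    for s_logits, t_logits in zip(student_edges, teacher_edges):
        p_teacher = softmax(t_logits, temperature)
        p_student = softmax(s_logits, temperature)
        # The teacher distribution is the target the student should mimic.
        losses.append(kl_divergence(p_teacher, p_student))
    return sum(losses) / len(losses)
```

Under these assumptions the loss is zero when the student reproduces the teacher's edge distributions exactly and grows as they diverge. It also leaves the student's architecture untouched, which is consistent with the claim that the gains come without any cost at inference time.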

