Distilling Image Classifiers in Object Detectors
This work addresses the challenge of transferring knowledge across tasks in computer vision and offers a novel approach to enhancing object detectors, though it is incremental within the broader context of distillation techniques.
The paper tackles cross-task knowledge distillation by introducing a classifier-to-detector framework that improves both the recognition accuracy and the localization performance of object detectors. In experiments, it outperforms state-of-the-art detector-to-detector methods.
Knowledge distillation constitutes a simple yet effective way to improve the performance of a compact student network by exploiting the knowledge of a more powerful teacher. Nevertheless, the knowledge distillation literature remains limited to the scenario where the student and the teacher tackle the same task. Here, we investigate the problem of transferring knowledge not only across architectures but also across tasks. To this end, we study the case of object detection and, instead of following the standard detector-to-detector distillation approach, introduce a classifier-to-detector knowledge transfer framework. In particular, we propose strategies to exploit the classification teacher to improve both the detector's recognition accuracy and localization performance. Our experiments on several detectors with different backbones demonstrate the effectiveness of our approach, allowing us to outperform the state-of-the-art detector-to-detector distillation methods.
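For readers unfamiliar with the teacher-student setup, the following is a minimal sketch of the generic soft-label distillation objective that this line of work builds on. It is not the classifier-to-detector method proposed here, whose details are not given in this abstract. The function names, the temperature value, and the assumption that the student detector and the classification teacher each produce class logits for the same image region are all illustrative.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) between softened class distributions.

    Hypothetical helper: assumes both networks score the same set of
    classes for one region. The T^2 factor is the usual rescaling that
    keeps gradient magnitudes comparable across temperatures.
    """
    p = softmax(teacher_logits, temperature)  # soft targets from the teacher
    q = softmax(student_logits, temperature)  # student predictions
    kl = sum(pi * (math.log(pi) - math.log(qi)) for pi, qi in zip(p, q) if pi > 0)
    return temperature ** 2 * kl


if __name__ == "__main__":
    teacher = [4.0, 1.0, 0.5]   # illustrative logits, not real data
    student = [2.0, 1.5, 1.0]
    print(distillation_loss(student, teacher))
```

The loss is zero when the student reproduces the teacher's softened distribution and grows as the two diverge. In practice it would be combined with the detector's own training losses.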