CV AI LG | May 30, 2022

Towards Efficient 3D Object Detection with Knowledge Distillation

Jihan Yang, Shaoshuai Shi, Runyu Ding, Zhe Wang, Xiaojuan Qi

arXiv:2205.15156v3 | 17.967 citations | h-index: 58 | Has Code

Originality: Incremental advance

AI Analysis

This work addresses efficiency issues in 3D object detection for applications like autonomous driving, but it is incremental as it adapts existing knowledge distillation methods to a new domain.

The paper tackles the high computational overhead of advanced 3D object detectors by applying knowledge distillation to build efficient models. Its best model reaches 65.75% LEVEL 2 mAPH on the Waymo dataset while using only 44% of the teacher's FLOPs, and its most efficient model runs at 51 FPS, which is 2.2x faster than PointPillar with higher accuracy.

Despite substantial progress in 3D object detection, advanced 3D detectors often suffer from heavy computation overheads. To this end, we explore the potential of knowledge distillation (KD) for developing efficient 3D object detectors, focusing on popular pillar- and voxel-based detectors. In the absence of well-developed teacher-student pairs, we first study how to obtain student models with good trade-offs between accuracy and efficiency from the perspectives of model compression and input resolution reduction. Then, we build a benchmark to assess existing KD methods developed in the 2D domain for 3D object detection upon six well-constructed teacher-student pairs. Further, we propose an improved KD pipeline incorporating an enhanced logit KD method that performs KD on only a few pivotal positions determined by teacher classification response, and a teacher-guided student model initialization to facilitate transferring the teacher model's feature extraction ability to students through weight inheritance. Finally, we conduct extensive experiments on the Waymo dataset. Our best performing model achieves 65.75% LEVEL 2 mAPH, surpassing its teacher model and requiring only 44% of the teacher's FLOPs. Our most efficient model runs at 51 FPS on an NVIDIA A100, which is 2.2× faster than PointPillar with even higher accuracy. Code is available at https://github.com/CVMI-Lab/SparseKD.
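
The two pipeline components named in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation (see the linked repository for that). The function names, the choice of the top-k teacher responses as the "pivotal positions", the mean-squared-error loss, and the leading-slice rule for narrower student layers are all assumptions made here for illustration; the abstract does not specify them.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def pivotal_logit_kd_loss(student_logits, teacher_logits, k=4):
    """Hypothetical logit KD restricted to a few pivotal positions.

    Both inputs are classification maps of shape (num_classes, H, W).
    Assumption: "pivotal" means the k positions with the highest teacher
    class probability; the loss is MSE between student and teacher
    probabilities at those positions only.
    """
    num_classes = teacher_logits.shape[0]
    t_prob = sigmoid(teacher_logits).reshape(num_classes, -1)
    s_prob = sigmoid(student_logits).reshape(num_classes, -1)

    # Teacher classification response per position: max over classes.
    response = t_prob.max(axis=0)
    k = max(1, min(k, response.size))

    # Flat indices of the k strongest teacher responses (unordered).
    pivotal = np.argpartition(-response, k - 1)[:k]

    diff = s_prob[:, pivotal] - t_prob[:, pivotal]
    return float(np.mean(diff ** 2)), pivotal


def inherit_weights(teacher_state, student_state):
    """Hypothetical teacher-guided initialization by weight inheritance.

    For each student parameter, copy the same-named teacher parameter.
    Assumption: when the student layer is narrower, keep the leading
    slice along every axis; otherwise keep the student's own values.
    """
    initialized = {}
    for name, s_w in student_state.items():
        t_w = teacher_state.get(name)
        fits = (
            t_w is not None
            and t_w.ndim == s_w.ndim
            and all(t >= s for t, s in zip(t_w.shape, s_w.shape))
        )
        if fits:
            crop = tuple(slice(0, s) for s in s_w.shape)
            initialized[name] = t_w[crop].copy()
        else:
            initialized[name] = s_w.copy()
    return initialized
```

In this sketch, student errors at positions the teacher does not respond to strongly contribute nothing to the loss, which is the sense in which distillation happens on "only a few pivotal positions".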
