CV | LG | May 30, 2022

Knowledge Distillation for 6D Pose Estimation by Aligning Distributions of Local Predictions

arXiv:2205.14971v2 | 19 citations | h-index: 67
AI Analysis

This work addresses the challenge of training compact models for image-based 6D object pose estimation by introducing a knowledge distillation approach designed for that task.

The paper applies knowledge distillation to 6D pose estimation, a setting the authors describe as previously unstudied. Rather than supervising each student prediction with the matching teacher prediction, the method aligns the student's distribution of local predictions with the teacher's. The authors report state-of-the-art results on several benchmarks with compact student models.

Knowledge distillation facilitates the training of a compact student network by using a deep teacher one. While this has achieved great success in many tasks, it remains completely unstudied for image-based 6D object pose estimation. In this work, we introduce the first knowledge distillation method driven by the 6D pose estimation task. To this end, we observe that most modern 6D pose estimation frameworks output local predictions, such as sparse 2D keypoints or dense representations, and that the compact student network typically struggles to predict such local quantities precisely. Therefore, instead of imposing prediction-to-prediction supervision from the teacher to the student, we propose to distill the teacher's distribution of local predictions into the student network, facilitating its training. Our experiments on several benchmarks show that our distillation method yields state-of-the-art results with different compact student models and for both keypoint-based and dense prediction-based architectures.
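The contrast between prediction-to-prediction supervision and distribution alignment can be illustrated with a small sketch. The abstract does not say which alignment loss the authors use, so the code below assumes an entropy-regularised optimal transport (Sinkhorn) cost between two sets of 2D keypoint predictions. The function names, the uniform weighting, the `eps` value, and the assumption that coordinates are normalised to [0, 1] are all illustrative choices, not details taken from the paper.

```python
import numpy as np


def pairwise_sq_dists(a, b):
    """Squared Euclidean distance between every row of a (n, d) and b (m, d)."""
    diff = a[:, None, :] - b[None, :, :]
    return np.sum(diff ** 2, axis=-1)


def pointwise_cost(student, teacher):
    """Prediction-to-prediction loss: the i-th student point is tied to the i-th teacher point."""
    return float(np.mean(np.sum((student - teacher) ** 2, axis=-1)))


def sinkhorn_cost(student, teacher, eps=0.01, n_iters=200):
    """Distribution-level loss (assumed here): entropic optimal transport cost
    between the two point sets, each treated as a uniform distribution.

    Assumes coordinates are normalised to roughly [0, 1]; much larger
    coordinates would need a larger eps to avoid numerical underflow.
    """
    cost = pairwise_sq_dists(student, teacher)
    n, m = cost.shape
    mu = np.full(n, 1.0 / n)   # uniform mass on student predictions
    nu = np.full(m, 1.0 / m)   # uniform mass on teacher predictions
    kernel = np.exp(-cost / eps)
    u = np.ones(n)
    v = np.ones(m)
    for _ in range(n_iters):   # Sinkhorn scaling iterations
        u = mu / (kernel @ v)
        v = nu / (kernel.T @ u)
    plan = u[:, None] * kernel * v[None, :]   # soft assignment between the sets
    return float(np.sum(plan * cost))


# Hypothetical teacher keypoints and a student that predicts the same set
# of locations, but in a different order.
teacher_kpts = np.array([[0.0, 0.0], [0.5, 0.0], [1.0, 0.0],
                         [0.0, 1.0], [0.5, 1.0], [1.0, 1.0]])
student_kpts = teacher_kpts[::-1]

print("pointwise:", pointwise_cost(student_kpts, teacher_kpts))
print("sinkhorn :", sinkhorn_cost(student_kpts, teacher_kpts))
```

In this toy case the pointwise loss penalises the student heavily because it compares predictions index by index, while the transport cost is close to zero because the two sets cover the same locations. That is the general sense in which a distribution-level objective is less strict than one-to-one supervision.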

