CV | Mar 19, 2025

Distilling 3D distinctive local descriptors for 6D pose estimation

arXiv:2503.15106v3 | 2 citations | h-index: 15 | IROS
Originality: Incremental advance
AI Analysis

This work addresses the efficiency bottleneck for real-world 6D pose estimation applications, making zero-shot methods more practical, though it is incremental as it builds on GeDi.

The paper tackles GeDi's computationally expensive inference in 6D pose estimation by introducing a knowledge distillation framework that trains an efficient student model, significantly reducing inference time while maintaining competitive performance on five BOP Benchmark datasets.

Three-dimensional local descriptors are crucial for encoding geometric surface properties, making them essential for various point cloud understanding tasks. Among these descriptors, GeDi has demonstrated strong zero-shot 6D pose estimation capabilities but remains computationally impractical for real-world applications due to its expensive inference process. Can we retain GeDi's effectiveness while significantly improving its efficiency? In this paper, we explore this question by introducing a knowledge distillation framework that trains an efficient student model to regress local descriptors from a GeDi teacher. Our key contributions include: an efficient large-scale training procedure that ensures robustness to occlusions and partial observations while operating under compute and storage constraints, and a novel loss formulation that handles weak supervision from non-distinctive teacher descriptors. We validate our approach on five BOP Benchmark datasets and demonstrate a significant reduction in inference time while maintaining competitive performance with existing methods, bringing zero-shot 6D pose estimation closer to real-time feasibility. Project Website: https://tev-fbk.github.io/dGeDi/
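To make the distillation idea concrete, the following is a minimal sketch of a descriptor-regression loss in which a student's per-point descriptors are pulled toward a teacher's, with optional per-point weights that reduce the influence of non-distinctive teacher descriptors. It is an illustration only and is not the loss proposed in the paper: the function names, the cosine-distance objective, and the nearest-neighbour weighting heuristic are all assumptions introduced here.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row of x to unit L2 norm (eps guards against zero rows)."""
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    return x / np.maximum(norms, eps)


def distinctiveness_weights(teacher):
    """Hypothetical heuristic, not taken from the paper.

    A teacher descriptor that is nearly identical to another teacher
    descriptor in the same batch is treated as non-distinctive and gets a
    weight close to 0; a descriptor far from all others gets a weight
    close to 1.
    """
    t = l2_normalize(np.asarray(teacher, dtype=float))
    sim = t @ t.T                      # pairwise cosine similarity
    np.fill_diagonal(sim, -np.inf)     # ignore self-similarity
    nearest = sim.max(axis=1)          # similarity to the closest other descriptor
    return np.clip(1.0 - nearest, 0.0, 1.0)


def distillation_loss(student, teacher, weights=None):
    """Weighted mean cosine distance between student and teacher descriptors."""
    s = l2_normalize(np.asarray(student, dtype=float))
    t = l2_normalize(np.asarray(teacher, dtype=float))
    per_point = 1.0 - np.sum(s * t, axis=1)
    if weights is None:
        weights = np.ones(len(per_point))
    return float(np.sum(weights * per_point) / max(np.sum(weights), 1e-12))


# Toy batch: the first two teacher descriptors are duplicates (non-distinctive).
teacher = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
student = np.array([[0.0, 1.0], [0.0, 1.0], [0.0, 1.0]])
w = distinctiveness_weights(teacher)
unweighted = distillation_loss(student, teacher)
weighted = distillation_loss(student, teacher, w)
```

In this toy batch the student only matches the one distinctive teacher descriptor, so the weighted loss ignores the errors on the duplicated descriptors while the unweighted loss penalises them. In a real training setup the same computation would be expressed in an autodiff framework so that gradients can flow to the student network.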

