Revisiting Fully Convolutional Geometric Features for Object 6D Pose Estimation
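This work addresses object 6D pose estimation for robotics and AR/VR applications. It presents an incremental improvement: an existing method, FCGF, is tailored to the task through changes to the loss, the input data representation, the training strategy, and the data augmentations.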
The paper tackles the problem of 6D object pose estimation by revisiting Fully Convolutional Geometric Features (FCGF) to learn point-level discriminative features, achieving state-of-the-art performance on popular benchmarks.
Recent works on 6D object pose estimation learn keypoint correspondences between images and object models, and then determine the object pose either through RANSAC-based algorithms or by directly regressing it with end-to-end optimisation. We argue that learning point-level discriminative features is overlooked in the literature. To this end, we revisit Fully Convolutional Geometric Features (FCGF) and tailor it for object 6D pose estimation to achieve state-of-the-art performance. FCGF employs sparse convolutions and learns point-level features with a fully-convolutional network by optimising a hardest contrastive loss. We outperform recent competitors on popular benchmarks by adopting key modifications to the loss and to the input data representations, by carefully tuning the training strategies, and by employing data augmentations suitable for the underlying problem. We carry out a thorough ablation to study the contribution of each modification. The code is available at https://github.com/jcorsetti/FCGF6D.
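To make the hardest contrastive loss mentioned above more concrete, the following is a minimal sketch of its general form, not the modified loss used in this work. Matched features are pulled together up to a positive margin, and each feature is pushed away from its closest non-matching feature (the hardest negative) up to a negative margin. The function name, the margin values, and the rule for selecting negatives (every feature except the matched one) are illustrative assumptions; practical implementations typically operate on GPU tensors and sample negative candidates.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two feature vectors given as sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def hardest_contrastive_loss(feats_a, feats_b, pos_pairs,
                             pos_margin=0.1, neg_margin=1.4):
    """Simplified hardest contrastive loss (illustrative sketch).

    feats_a, feats_b: lists of feature vectors from the two inputs.
    pos_pairs: list of (i, j) pairs, feats_a[i] matching feats_b[j].
    pos_margin, neg_margin: assumed margin values, chosen for illustration.
    """
    if not pos_pairs:
        raise ValueError("at least one positive pair is required")

    pos_term = 0.0
    neg_term = 0.0
    for i, j in pos_pairs:
        # Positive term: penalise matched features farther apart than pos_margin.
        d_pos = euclidean(feats_a[i], feats_b[j])
        pos_term += max(0.0, d_pos - pos_margin) ** 2

        # Candidate negatives: all features on the other side except the match
        # (an assumption of this sketch).
        negs_i = [euclidean(feats_a[i], f) for k, f in enumerate(feats_b) if k != j]
        negs_j = [euclidean(feats_b[j], f) for k, f in enumerate(feats_a) if k != i]

        # Negative term: penalise only the hardest (closest) negative,
        # and only if it lies inside neg_margin.
        for negs in (negs_i, negs_j):
            if negs:
                neg_term += 0.5 * max(0.0, neg_margin - min(negs)) ** 2

    return (pos_term + neg_term) / len(pos_pairs)
```

In this sketch, features that are identical for matched points and well separated otherwise yield a loss of zero, whereas features that collapse to a single vector are penalised through the negative term.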