CV LG RO Sep 21, 2021

KDFNet: Learning Keypoint Distance Field for 6D Object Pose Estimation

arXiv:2109.10127v18.018 citations

Originality: Highly original

AI Analysis

This work addresses robust pose estimation for robotics and AR/VR applications, improving on existing methods in challenging cases such as occlusion.

The paper tackles 6D object pose estimation from RGB images, focusing on occlusion and on long, thin objects. It proposes a novel Keypoint Distance Field (KDF) representation and a distance-based voting scheme, and reports state-of-the-art performance: 50.3% average ADD(-S) accuracy on Occlusion LINEMOD and 75.72% average ADD accuracy on the TOD mug subset.

We present KDFNet, a novel method for 6D object pose estimation from RGB images. To handle occlusion, many recent works localize 2D keypoints through pixel-wise voting and then solve a Perspective-n-Point (PnP) problem for pose estimation, an approach that achieves leading performance. However, such a voting process is direction-based and cannot handle long and thin objects, for which the direction intersections cannot be robustly found. To address this problem, we propose a novel continuous representation called the Keypoint Distance Field (KDF) for projected 2D keypoint locations. Formulated as a 2D array, the KDF stores in each element the 2D Euclidean distance between the corresponding image pixel and a specified projected 2D keypoint. We use a fully convolutional neural network to regress the KDF for each keypoint. Using this KDF encoding of projected object keypoint locations, we propose a distance-based voting scheme that localizes the keypoints by calculating circle intersections in a RANSAC fashion. We validate the design choices of our framework through extensive ablation experiments. Our proposed method achieves state-of-the-art performance on the Occlusion LINEMOD dataset, with an average ADD(-S) accuracy of 50.3%, and on the mug subset of the TOD dataset, with an average ADD accuracy of 75.72%. Extensive experiments and visualizations demonstrate that the proposed method robustly estimates the 6D pose in challenging scenarios, including occlusion.
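The KDF definition and the circle-intersection voting described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The function names, the `mask` argument (assumed here to be a boolean object mask), the inlier tolerance, and the inlier-count scoring rule are all assumptions made for illustration. The abstract does not specify how hypotheses are scored, and in KDFNet the distance field is regressed by a network rather than computed from a known keypoint.

```python
import numpy as np

def keypoint_distance_field(height, width, keypoint_xy):
    """Ground-truth KDF: (H, W) array of pixel-to-keypoint Euclidean distances."""
    ys, xs = np.mgrid[0:height, 0:width]
    return np.hypot(xs - keypoint_xy[0], ys - keypoint_xy[1])

def circle_intersections(c0, r0, c1, r1):
    """Intersection points of two circles; empty list if they do not intersect."""
    c0, c1 = np.asarray(c0, float), np.asarray(c1, float)
    delta = c1 - c0
    d = np.linalg.norm(delta)
    if d == 0 or d > r0 + r1 or d < abs(r0 - r1):
        return []
    a = (r0**2 - r1**2 + d**2) / (2 * d)      # distance from c0 to the chord
    h = np.sqrt(max(r0**2 - a**2, 0.0))       # half-length of the chord
    mid = c0 + a * delta / d
    perp = np.array([-delta[1], delta[0]]) / d
    return [mid + h * perp, mid - h * perp]

def vote_keypoint(kdf, mask, n_hypotheses=200, inlier_tol=1.0, seed=0):
    """RANSAC-style voting (assumed scoring): keep the hypothesis with most inliers."""
    rng = np.random.default_rng(seed)
    ys, xs = np.nonzero(mask)
    pixels = np.stack([xs, ys], axis=1).astype(float)
    radii = kdf[ys, xs]                       # each pixel votes with a circle
    best, best_score = None, -1
    for _ in range(n_hypotheses):
        i, j = rng.choice(len(pixels), size=2, replace=False)
        for cand in circle_intersections(pixels[i], radii[i], pixels[j], radii[j]):
            # A pixel is an inlier if its circle passes near the candidate.
            err = np.abs(np.linalg.norm(pixels - cand, axis=1) - radii)
            score = int((err < inlier_tol).sum())
            if score > best_score:
                best, best_score = cand, score
    return best
```

Each pair of sampled pixels yields up to two candidate locations, the true keypoint and its mirror image across the line through the two pixels. Counting inliers over the whole mask is one way to separate them. The keypoints recovered this way would then be passed to a PnP solver, which is not shown here.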
