CATRE: Iterative Point Clouds Alignment for Category-level Object Pose Refinement
This work addresses the challenge of accurate object pose estimation in robotics and AR/VR, offering a novel refinement approach that improves on existing methods.
The paper tackles category-level 9DoF object pose estimation by introducing CATRE, an iterative refiner that improves pose accuracy from point clouds, achieving state-of-the-art results on benchmarks such as REAL275 and CAMERA25 at speeds of up to ~85.32 Hz.
Although category-level 9DoF object pose estimation has emerged recently, previous correspondence-based and direct regression methods are both limited in accuracy by the large intra-category variance in object shape, color, and other attributes. Orthogonal to them, this work presents CATRE, a category-level object pose and size refiner that iteratively improves a pose estimate from point clouds to produce accurate results. Given an initial pose estimate, CATRE predicts the relative transformation between the initial pose and the ground truth by aligning the partially observed point cloud with an abstract shape prior. Specifically, we propose a novel disentangled architecture that accounts for the inherent distinctions between rotation estimation and translation/size estimation. Extensive experiments show that our approach remarkably outperforms state-of-the-art methods on the REAL275, CAMERA25, and LM benchmarks at speeds of up to ~85.32 Hz, and achieves competitive results on category-level tracking. We further demonstrate that CATRE can perform pose refinement on unseen categories. Code and trained models are available.
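The iterative refinement loop described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`refine_pose`, `make_oracle`, `rot_z`), the number of iterations, and the update rule (left-multiplied rotation, additive translation, multiplicative size) are assumptions made for illustration. The learned network is replaced by a hypothetical oracle that steps toward a known pose and ignores the point clouds.

```python
import numpy as np

def rot_z(angle):
    """Rotation matrix for a rotation of `angle` radians about the z-axis."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def refine_pose(R, t, s, observed, prior, predict_delta, n_iters=4):
    """Iteratively refine a 9DoF pose (rotation R, translation t, size s)."""
    for _ in range(n_iters):
        # Move the observed points into the current object frame: R^T (p - t).
        focused = (observed - t) @ R
        # Hypothetical predictor returning a relative rotation, translation, size.
        dR, dt, ds = predict_delta(focused, prior, R, t, s)
        # Assumed update rule, chosen for illustration only.
        R, t, s = dR @ R, t + dt, s * ds
    return R, t, s

def make_oracle(R_gt, t_gt, s_gt, step=0.5):
    """Stand-in for a learned network: steps part of the way to a known pose."""
    def predict_delta(focused, prior, R, t, s):
        dR = R_gt @ R.T            # full rotation correction
        dt = step * (t_gt - t)     # partial translation correction
        ds = (s_gt / s) ** step    # partial per-axis size correction
        return dR, dt, ds
    return predict_delta

rng = np.random.default_rng(0)
prior = rng.uniform(-0.5, 0.5, size=(128, 3))   # stand-in for a shape prior
R_gt = rot_z(0.6)
t_gt = np.array([0.1, -0.2, 0.9])
s_gt = np.array([0.3, 0.2, 0.4])
observed = (prior * s_gt) @ R_gt.T + t_gt       # synthetic observed points

# Perturbed initial estimate, as an upstream estimator might produce.
R0, t0, s0 = rot_z(0.3), t_gt + 0.05, s_gt * 1.2
R, t, s = refine_pose(R0, t0, s0, observed, prior, make_oracle(R_gt, t_gt, s_gt))
print(np.linalg.norm(t0 - t_gt), np.linalg.norm(t - t_gt))
```

In an actual refiner, `predict_delta` would be a network that consumes the two point clouds. The oracle here exists only to show how repeated relative updates pull an initial estimate toward the target pose.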