Extreme Relative Pose Estimation for RGB-D Scans via Scene Completion
This work addresses a fundamental limitation in computer vision and robotics by extending relative pose estimation to extreme cases in which the input scans share little or no overlap. The contribution is incremental but impactful for applications that require robust alignment from sparse data.
The paper tackles the problem of estimating the relative pose between RGB-D scans with little or no overlap. It introduces a deep neural network that alternates between scene completion and pose estimation, yielding considerable improvements over state-of-the-art methods and enabling pose estimates even for non-overlapping scans.
Estimating the relative rigid pose between two RGB-D scans of the same underlying environment is a fundamental problem in computer vision, robotics, and computer graphics. Most existing approaches tolerate only limited relative pose changes because they require considerable overlap between the input scans. We introduce a novel deep neural network that extends the scope to extreme relative poses, with little or even no overlap between the input scans. The key idea is to infer more complete scene information about the underlying environment and to match on the completed scans. In particular, instead of only performing scene completion from each individual scan, our approach alternates between relative pose estimation and scene completion. This allows scene completion to use information from both input scans in later iterations, which improves both scene completion and relative pose estimation. Experimental results on benchmark datasets show that our approach leads to considerable improvements over state-of-the-art approaches for relative pose estimation. In particular, our approach provides encouraging relative pose estimates even between non-overlapping scans.
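The alternation described in the abstract can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and not the paper's implementation: `complete_scene` and `match` are hypothetical placeholders for the learned completion and matching modules, `alternate` is a hypothetical driver loop, and the pose step uses the standard Kabsch (SVD) least-squares fit on matched 3D points as a stand-in for whatever pose module the network actually uses.

```python
import numpy as np


def rigid_pose_from_matches(src, dst):
    """Kabsch fit: find R, t so that dst is approximately src @ R.T + t.

    src, dst: (N, 3) arrays of matched 3D points (assumed correspondences).
    """
    src_c = src.mean(axis=0)
    dst_c = dst.mean(axis=0)
    # 3x3 cross-covariance of the centered point sets.
    H = (src - src_c).T @ (dst - dst_c)
    U, _, Vt = np.linalg.svd(H)
    # Flip the last axis if needed so that R is a proper rotation (det = +1).
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = dst_c - R @ src_c
    return R, t


def alternate(scan_a, scan_b, complete_scene, match, num_iters=3):
    """Alternate between scene completion and relative pose estimation.

    complete_scene(scan, context) -> completed (M, 3) points; context is None
        on the first pass and the other scan, expressed in this scan's
        frame under the current pose estimate, on later passes.
    match(full_a, full_b) -> (src, dst) arrays of matched points.
    Both callables are hypothetical stand-ins for learned modules.
    """
    R, t = np.eye(3), np.zeros(3)
    context_a = context_b = None
    for _ in range(num_iters):
        # Completion step: the first pass sees one scan only, later passes
        # also see the other scan through the current pose estimate.
        full_a = complete_scene(scan_a, context_a)
        full_b = complete_scene(scan_b, context_b)
        # Pose step: match on the completed scans and refit the pose.
        src, dst = match(full_a, full_b)
        R, t = rigid_pose_from_matches(src, dst)
        # Share each scan with the other frame for the next completion.
        context_a = (scan_b - t) @ R      # inverse pose, b frame to a frame
        context_b = scan_a @ R.T + t      # forward pose, a frame to b frame
    return R, t
```

With trivial stand-ins such as `complete_scene = lambda scan, ctx: scan` and `match = lambda a, b: (a, b)`, the loop reduces to a single rigid fit on already-matched points. The interesting behavior in the paper comes from the learned completion, which this sketch does not attempt to reproduce.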