Extreme Rotation Estimation in the Wild
This work addresses the challenge of extreme rotation estimation for computer vision applications in diverse, unconstrained environments, and represents an incremental improvement over prior methods that assume constrained settings.
The paper tackles the problem of estimating the relative 3D orientation between Internet images with limited or non-overlapping views in real-world settings. The result is a Transformer-based method that outperforms baselines on a new benchmark dataset.
We present a technique and benchmark dataset for estimating the relative 3D orientation between a pair of Internet images captured in an extreme setting, where the images have limited or non-overlapping fields of view. Prior work targeting extreme rotation estimation assumes constrained 3D environments and emulates perspective images by cropping regions from panoramic views. However, real images captured in the wild are highly diverse, exhibiting variation in both appearance and camera intrinsics. In this work, we propose a Transformer-based method for estimating relative rotations in extreme real-world settings, and contribute the ExtremeLandmarkPairs dataset, assembled from scene-level Internet photo collections. Our evaluation demonstrates that our approach succeeds in estimating the relative rotations in a wide variety of extreme-view Internet image pairs, outperforming various baselines, including dedicated rotation estimation techniques and contemporary 3D reconstruction methods.
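To make the quantity being estimated concrete, the following minimal sketch computes the relative rotation between two known camera orientations and the angle of that rotation. It is illustrative only and is not the paper's method or evaluation code. The convention R_rel = R2 * R1^T, the helper names (`rot_y`, `relative_rotation`, `geodesic_angle_deg`), and the example yaw angles are all assumptions made for this illustration.

```python
import math

def rot_y(deg):
    """Rotation about the y-axis (yaw) by `deg` degrees, as a 3x3 nested list."""
    a = math.radians(deg)
    c, s = math.cos(a), math.sin(a)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]

def matmul(a, b):
    """Product of two 3x3 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(a):
    """Transpose of a 3x3 matrix (equal to the inverse for a rotation)."""
    return [list(row) for row in zip(*a)]

def relative_rotation(r1, r2):
    """Assumed convention: the rotation taking orientation r1 to r2, R2 * R1^T."""
    return matmul(r2, transpose(r1))

def geodesic_angle_deg(r):
    """Rotation angle of r in degrees, using trace(r) = 1 + 2*cos(theta)."""
    trace = r[0][0] + r[1][1] + r[2][2]
    # Clamp to [-1, 1] to guard against floating-point drift before acos.
    cos_theta = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
    return math.degrees(math.acos(cos_theta))

# Hypothetical pair: two cameras whose yaw differs by 120 degrees, a gap
# wide enough that typical fields of view would share little or no content.
r1 = rot_y(10.0)
r2 = rot_y(130.0)
r_rel = relative_rotation(r1, r2)
angle = geodesic_angle_deg(r_rel)
print(f"relative rotation angle: {angle:.1f} degrees")
```

The same `geodesic_angle_deg` function, applied to the product of a predicted rotation and the transpose of a ground-truth rotation, gives a commonly used angular error between the two. Whether the paper reports exactly this measure is not stated in the abstract.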