Helvipad: A Real-World Dataset for Omnidirectional Stereo Depth Estimation
This paper contributes a dataset and model adaptations for omnidirectional stereo depth estimation, addressing a data gap in computer vision. The contribution is incremental, however, because the adaptations build on existing stereo methods.
The authors address the lack of data for omnidirectional stereo depth estimation by introducing Helvipad, a real-world dataset of 40K video frames with accurate depth labels, and show that adapting stereo models to omnidirectional images improves performance in this domain.
Despite progress in stereo depth estimation, omnidirectional imaging remains underexplored, mainly due to the lack of appropriate data. We introduce Helvipad, a real-world dataset for omnidirectional stereo depth estimation, featuring 40K frames from video sequences across diverse environments, including crowded indoor and outdoor scenes with various lighting conditions. Collected using two 360° cameras in a top-bottom setup and a LiDAR sensor, the dataset includes accurate depth and disparity labels obtained by projecting 3D point clouds onto equirectangular images. Additionally, we provide an augmented training set with increased label density, obtained through depth completion. We benchmark leading stereo depth estimation models for both standard and omnidirectional images. The results show that while recent stereo methods perform decently, accurately estimating depth in omnidirectional imaging remains a challenge. To address this, we introduce necessary adaptations to stereo models, leading to improved performance.
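The labeling step described above, projecting 3D points onto equirectangular images and deriving disparity for a top-bottom camera pair, can be sketched as follows. This is a minimal illustration and not the authors' pipeline. The axis convention (z up, polar angle measured from +z), the function names, the image size, and the baseline value are all assumptions made for this example, and the points are assumed to be already expressed in the bottom camera's frame.

```python
import math


def polar_angle(point):
    """Polar angle in radians of a 3D point, measured from the +z axis (assumed 'up')."""
    x, y, z = point
    r = math.sqrt(x * x + y * y + z * z)
    if r == 0.0:
        raise ValueError("point coincides with the camera centre")
    # Clamp to guard against rounding slightly outside [-1, 1].
    return math.acos(max(-1.0, min(1.0, z / r)))


def project_equirectangular(point, width, height):
    """Map a 3D point to sub-pixel equirectangular coordinates (u, v) and its depth.

    Assumed convention: azimuth is measured in the x-y plane and mapped to
    image columns; the polar angle is mapped to image rows.
    """
    x, y, z = point
    depth = math.sqrt(x * x + y * y + z * z)
    azimuth = math.atan2(y, x)                      # in [-pi, pi]
    polar = polar_angle(point)                      # in [0, pi]
    u = ((azimuth + math.pi) / (2.0 * math.pi) * width) % width
    v = polar / math.pi * height
    return u, v, depth


def vertical_disparity(point_bottom, baseline):
    """Angular disparity in radians between the top and bottom views.

    Assumes the top camera sits `baseline` metres above the bottom camera
    along +z, with both cameras sharing the same orientation.
    """
    x, y, z = point_bottom
    polar_bottom = polar_angle((x, y, z))
    polar_top = polar_angle((x, y, z - baseline))   # same point seen from the top camera
    return polar_top - polar_bottom


# Hypothetical values chosen only for illustration.
WIDTH, HEIGHT, BASELINE = 2048, 1024, 0.2
point = (3.0, 0.0, 0.0)                             # 3 m away, level with the bottom camera

u, v, depth = project_equirectangular(point, WIDTH, HEIGHT)
disparity_rad = vertical_disparity(point, BASELINE)
disparity_px = disparity_rad / math.pi * HEIGHT     # disparity expressed in image rows
```

Under these assumptions the disparity is purely vertical, because a vertical baseline leaves the azimuth of a point unchanged between the two views. Applying the law of sines to the triangle formed by the two camera centres and the point gives the depth from the bottom camera as `baseline * sin(polar_top) / sin(disparity_rad)`, which is one way to convert between depth and disparity labels in such a setup.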