CV · Feb 20, 2023

Unsupervised OmniMVS: Efficient Omnidirectional Depth Inference via Establishing Pseudo-Stereo Supervision

Zisong Chen, Chunyu Lin, Lang Nie, Kang Liao, Yao Zhao

arXiv:2302.09922v28.412 citations · h-index: 18

Originality Highly original

AI Analysis

This work addresses the impracticality of supervised omnidirectional MVS in real-world applications by eliminating the need for costly depth labels.

The paper tackles the cost of dense depth labels in omnidirectional multi-view stereo (MVS) by proposing the first unsupervised framework based on multiple fisheye images. It reports performance competitive with state-of-the-art supervised methods and better generalization on real-world data.

Omnidirectional multi-view stereo (MVS) vision is attractive for its ultra-wide field-of-view (FoV), enabling machines to perceive 360° 3D surroundings. However, the existing solutions require expensive dense depth labels for supervision, making them impractical in real-world applications. In this paper, we propose the first unsupervised omnidirectional MVS framework based on multiple fisheye images. To this end, we project all images to a virtual view center and composite two panoramic images with spherical geometry from two pairs of back-to-back fisheye images. The two 360° images form a stereo pair with a special pose, and photometric consistency is leveraged to establish the unsupervised constraint, which we term "Pseudo-Stereo Supervision". In addition, we propose Un-OmniMVS, an efficient unsupervised omnidirectional MVS network, which improves inference speed with two efficient components. First, a novel feature extractor with frequency attention is proposed to simultaneously capture non-local Fourier features and local spatial features, explicitly strengthening the feature representation. Then, a variance-based light cost volume is put forward to reduce the computational complexity. Experiments show that the performance of our unsupervised solution is competitive with that of the state-of-the-art (SoTA) supervised methods, with better generalization on real-world data.
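
The abstract names a variance-based cost volume but gives no details, so the following is only a minimal sketch of the general idea, not the paper's network. It assumes each view's features have already been warped to a common reference for every depth hypothesis, and it uses the per-channel variance across views as the matching cost, a metric commonly used in learning-based MVS. The function names `variance_cost` and `best_hypothesis`, and the nested-list layout `warped[d][v][c]`, are hypothetical choices made for illustration.

```python
def variance_cost(view_features):
    """Per-channel variance across views at one pixel and one depth hypothesis.

    view_features: list of N feature vectors (each a list of C floats),
    one per view, assumed to be already warped to the reference view.
    """
    n = len(view_features)
    channels = len(view_features[0])
    cost = []
    for c in range(channels):
        values = [view_features[v][c] for v in range(n)]
        mean = sum(values) / n
        # Low variance means the views agree, i.e. the hypothesis is plausible.
        cost.append(sum((x - mean) ** 2 for x in values) / n)
    return cost


def best_hypothesis(warped):
    """Return the index of the depth hypothesis with the lowest mean variance.

    warped[d][v][c]: feature channel c of view v under depth hypothesis d
    (an assumed layout for this sketch).
    """
    scores = []
    for views in warped:
        per_channel = variance_cost(views)
        scores.append(sum(per_channel) / len(per_channel))
    return min(range(len(scores)), key=scores.__getitem__)
```

Because the variance collapses any number of views into a single C-channel cost, the volume's size does not grow with the view count, which is one reason this metric is often chosen when memory and compute are a concern. A real network would typically regularize the volume and regress depth rather than take a hard minimum as done here.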
