CV, AI, LG | Aug 6, 2025

Extending Foundational Monocular Depth Estimators to Fisheye Cameras with Calibration Tokens

arXiv:2508.04928v3 | 8 citations | h-index: 7 | Has Code
Originality: Incremental advance
AI Analysis

This work addresses erroneous depth estimation for fisheye cameras, which are common in robotics and autonomous systems, with a lightweight adaptation method. The advance is incremental.

The paper tackles the problem of adapting foundational monocular depth estimators, trained on perspective images, to fisheye cameras by addressing covariate shifts from calibration differences, resulting in improved depth estimates without retraining. It introduces Calibration Tokens to align latent embeddings, achieving consistent gains over state-of-the-art methods on indoor and outdoor datasets.

We propose a method to extend foundational monocular depth estimators (FMDEs), trained on perspective images, to fisheye images. Despite being trained on tens of millions of images, FMDEs are susceptible to the covariate shift introduced by changes in camera calibration (intrinsic, distortion) parameters, leading to erroneous depth estimates. Our method aligns the distribution of latent embeddings encoding fisheye images to those of perspective images, enabling the reuse of FMDEs for fisheye cameras without retraining or finetuning. To this end, we introduce a set of Calibration Tokens as a lightweight adaptation mechanism that modulates the latent embeddings for alignment. By exploiting the already expressive latent space of FMDEs, we posit that modulating their embeddings avoids the negative impact of artifacts and loss introduced in conventional recalibration or map projection to a canonical reference frame in the image space. Our method is self-supervised and does not require fisheye images but leverages publicly available large-scale perspective image datasets. This is done by recalibrating perspective images to fisheye images and enforcing consistency between their estimates during training. We evaluate our approach with several FMDEs, both indoors and outdoors, where we consistently improve over state-of-the-art methods using a single set of tokens for both. Code available at: https://github.com/JungHeeKim29/calibration-token.
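
The recalibration step described in the abstract (turning perspective images into synthetic fisheye images, then comparing depth estimates) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper or its repository: the equidistant fisheye model (r = f · θ), the shared principal point, nearest-neighbour sampling, the mean-absolute-difference loss, and the function names `fisheye_to_perspective_pixel`, `recalibrate_to_fisheye`, and `consistency_loss`.

```python
import math

def fisheye_to_perspective_pixel(u_f, v_f, f_fish, f_persp, cx, cy):
    """Map a fisheye pixel to the perspective pixel that sees the same ray.

    Assumes an equidistant fisheye model (r = f_fish * theta) and a pinhole
    model (r = f_persp * tan(theta)) sharing the principal point (cx, cy).
    Returns None for rays at or beyond 90 degrees, which a pinhole camera
    cannot image.
    """
    dx, dy = u_f - cx, v_f - cy
    r_fish = math.hypot(dx, dy)
    if r_fish == 0.0:
        return (cx, cy)  # the optical axis maps to itself
    theta = r_fish / f_fish  # incidence angle of the ray
    if theta >= math.pi / 2:
        return None
    scale = f_persp * math.tan(theta) / r_fish
    return (cx + dx * scale, cy + dy * scale)

def recalibrate_to_fisheye(image, f_fish, f_persp):
    """Inverse-warp a perspective image (list of rows) into a fisheye view.

    Uses nearest-neighbour sampling; pixels with no valid source stay 0.0.
    """
    h, w = len(image), len(image[0])
    cx, cy = (w - 1) / 2.0, (h - 1) / 2.0
    out = [[0.0] * w for _ in range(h)]
    for v in range(h):
        for u in range(w):
            src = fisheye_to_perspective_pixel(u, v, f_fish, f_persp, cx, cy)
            if src is None:
                continue
            su, sv = int(round(src[0])), int(round(src[1]))
            if 0 <= su < w and 0 <= sv < h:
                out[v][u] = image[sv][su]
    return out

def consistency_loss(depth_a, depth_b):
    """Mean absolute difference between two same-sized depth maps."""
    diffs = [abs(a - b)
             for row_a, row_b in zip(depth_a, depth_b)
             for a, b in zip(row_a, row_b)]
    return sum(diffs) / len(diffs)
```

In a training loop along these lines, one would presumably warp the depth predicted on the perspective image with the same mapping, then compare it against the depth predicted on the synthetic fisheye image, updating only the tokens while the FMDE stays frozen. The repository linked above is the reference for what the authors actually do.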

