CV · Mar 6

RePer-360: Releasing Perspective Priors for 360° Depth Estimation via Self-Modulation

arXiv:2603.05999v1 · h-index: 1
Predicted impact: top 52% in CV · last 90 days · Originality: Incremental advance
AI Analysis

This work addresses a domain-specific challenge in panoramic depth estimation, relevant to applications such as VR and robotics, and offers an incremental but efficient adaptation method.

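The paper tackles the poor generalization of depth foundation models to 360° images, which stems from the geometric discrepancy between perspective and panoramic imagery. It proposes RePer-360, which adapts these models while preserving their perspective priors. According to the abstract, it surpasses standard fine-tuning while using only 1% of the training data and, under the same in-domain training setting, improves RMSE by approximately 20%.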
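To make the RMSE figure concrete, here is a minimal sketch of how depth RMSE and a relative improvement are commonly computed. The function names, the convention of skipping pixels with non-positive ground truth, and all numbers are illustrative assumptions; none of them come from the paper.

```python
import math


def rmse(pred, target):
    """Root-mean-square error over pixels with valid ground truth.

    Assumption: ground-truth depths <= 0 mark invalid pixels and are skipped,
    a common convention that the paper may or may not follow.
    """
    valid = [(p, t) for p, t in zip(pred, target) if t > 0]
    if not valid:
        raise ValueError("no valid ground-truth pixels")
    return math.sqrt(sum((p - t) ** 2 for p, t in valid) / len(valid))


def relative_improvement(baseline_rmse, new_rmse):
    """Fractional RMSE reduction relative to the baseline (0.2 means 20%)."""
    return (baseline_rmse - new_rmse) / baseline_rmse


# Hypothetical values, chosen only to illustrate what a 20% reduction means.
baseline, adapted = 0.50, 0.40
print(f"{relative_improvement(baseline, adapted):.0%}")
```

Because lower RMSE is better, an "improvement" here is a reduction relative to the baseline error.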

Recent depth foundation models trained on perspective imagery achieve strong performance, yet generalize poorly to 360° images due to the substantial geometric discrepancy between perspective and panoramic domains. Moreover, fully fine-tuning these models typically requires large amounts of panoramic data. To address this issue, we propose RePer-360, a distortion-aware self-modulation framework for monocular panoramic depth estimation that adapts depth foundation models while preserving powerful pretrained perspective priors. Specifically, we design a lightweight geometry-aligned guidance module to derive a modulation signal from two complementary projections (i.e., ERP and CP) and use it to guide the model toward the panoramic domain without overwriting its pretrained perspective knowledge. We further introduce a Self-Conditioned AdaLN-Zero mechanism that produces pixel-wise scaling factors to reduce the feature distribution gap between the perspective and panoramic domains. In addition, a cubemap-domain consistency loss further improves training stability and cross-projection alignment. By shifting the focus from complementary-projection fusion to panoramic domain adaptation under preserved pretrained perspective priors, RePer-360 surpasses standard fine-tuning methods while using only 1% of the training data. Under the same in-domain training setting, it further achieves an approximately 20% improvement in RMSE. Code will be released upon acceptance.
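The abstract does not spell out the Self-Conditioned AdaLN-Zero design, so the following is only a sketch of generic AdaLN-Zero modulation applied per pixel, not the authors' implementation. The class name `AdaLNZeroSketch`, the per-pixel `guidance` vectors, and the `branch` callable are hypothetical. It illustrates one plausible reason such a mechanism can preserve pretrained priors: with zero-initialized modulation parameters the block starts as an identity mapping.

```python
import math


def layer_norm(vec, eps=1e-6):
    """Normalize one pixel's channel vector to zero mean and unit variance."""
    mean = sum(vec) / len(vec)
    var = sum((v - mean) ** 2 for v in vec) / len(vec)
    return [(v - mean) / math.sqrt(var + eps) for v in vec]


def linear(weight, bias, vec):
    """Plain matrix-vector product plus bias."""
    return [sum(w * v for w, v in zip(row, vec)) + b for row, b in zip(weight, bias)]


class AdaLNZeroSketch:
    """Hypothetical per-pixel AdaLN-Zero block (illustrative only)."""

    def __init__(self, channels, guide_channels):
        self.channels = channels
        # Zero initialization: scale, shift, and gate all start at 0.
        self.weight = [[0.0] * guide_channels for _ in range(3 * channels)]
        self.bias = [0.0] * (3 * channels)

    def __call__(self, features, guidance, branch):
        c = self.channels
        out = []
        for x, g in zip(features, guidance):  # one iteration per pixel
            params = linear(self.weight, self.bias, g)
            scale, shift, gate = params[:c], params[c:2 * c], params[2 * c:]
            normed = layer_norm(x)
            modulated = [n * (1.0 + s) + b for n, s, b in zip(normed, scale, shift)]
            update = branch(modulated)
            # Gated residual: with gate == 0 the pixel passes through unchanged.
            out.append([xi + gi * ui for xi, gi, ui in zip(x, gate, update)])
        return out


features = [[1.0, 2.0, 3.0], [0.5, -1.0, 2.0]]   # 2 pixels, 3 channels
guidance = [[0.2, 0.1], [0.3, -0.4]]             # assumed modulation signal
block = AdaLNZeroSketch(channels=3, guide_channels=2)
double = lambda v: [2.0 * x for x in v]          # stand-in for a pretrained sub-layer
identity_out = block(features, guidance, double)
```

At initialization `identity_out` equals `features`, so training begins from the unmodified pretrained behavior and the modulation is learned gradually from there.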

