CV · AI · Dec 18, 2024

ConDo: Continual Domain Expansion for Absolute Pose Regression

Zijun Li, Zhipeng Cai, Bochun Yang, Xuelun Shen, Siqi Shen, Xiaoliang Fan, Michael Paulitsch, Cheng Wang

arXiv:2412.13452v15.22 citations · h-index: 8 · Has Code · AAAI

Originality: Incremental advance

AI Analysis

This work addresses visual localization in dynamic real-world applications such as robotics and autonomous vehicles, where environments change over time. It is a significant incremental improvement over APR methods trained on a fixed dataset.

The paper tackles the failure of absolute pose regression (APR) in changing environments by proposing ConDo, a method that continually updates the deployed APR model using unlabeled inference data. On challenging scenes it reduces localization error by over 7x (from 14.8 m to 1.7 m), and it achieves performance similar to re-training up to 25x faster.

Visual localization is a fundamental machine learning problem. Absolute Pose Regression (APR) trains a scene-dependent model to efficiently map an input image to the camera pose in a pre-defined scene. However, many applications have continually changing environments, where inference data at novel poses or under novel scene conditions (weather, geometry) appear after deployment. Training APR on a fixed dataset leads to overfitting, making it fail catastrophically on challenging novel data. This work proposes Continual Domain Expansion (ConDo), which continually collects unlabeled inference data to update the deployed APR. Instead of applying standard unsupervised domain adaptation methods, which are ineffective for APR, ConDo learns from unlabeled data by distilling knowledge from scene-agnostic localization methods. By sampling uniformly from historical and newly collected data, ConDo can effectively expand the generalization domain of APR. Large-scale benchmarks with various scene types are constructed to evaluate models under practical (long-term) data changes. ConDo consistently and significantly outperforms baselines across architectures, scene types, and data changes. On challenging scenes (Fig. 1), it reduces the localization error by >7x (14.8 m vs. 1.7 m). Analysis shows the robustness of ConDo to compute budgets, replay buffer sizes, and teacher prediction noise. Compared to model re-training, ConDo achieves similar performance up to 25x faster.
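The update loop described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the names `Sample`, `ReplayBuffer`, and `expand_domain`, the random-eviction policy, and the `teacher` and `update_step` callables are all hypothetical placeholders. The sketch only assumes what the abstract states, namely that new unlabeled images receive pseudo-labels from a scene-agnostic teacher and that training batches are drawn uniformly from the combined pool of historical and new data.

```python
import random
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

Pose = Tuple[float, ...]  # hypothetical pose encoding (e.g. translation + rotation)


@dataclass
class Sample:
    image: object   # placeholder for an image tensor or file path
    pose: Pose      # ground-truth label or teacher pseudo-label
    pseudo: bool    # True if the pose came from the teacher


class ReplayBuffer:
    """Single pool holding both historical and newly collected samples."""

    def __init__(self, capacity: int, seed: int = 0) -> None:
        self.capacity = capacity
        self.samples: List[Sample] = []
        self.rng = random.Random(seed)

    def add(self, sample: Sample) -> None:
        # Assumed eviction policy: once full, overwrite a random slot.
        if len(self.samples) < self.capacity:
            self.samples.append(sample)
        else:
            self.samples[self.rng.randrange(self.capacity)] = sample

    def sample_batch(self, batch_size: int) -> List[Sample]:
        # Uniform sampling over the whole pool, old and new alike.
        k = min(batch_size, len(self.samples))
        return self.rng.sample(self.samples, k)


def expand_domain(
    buffer: ReplayBuffer,
    new_images: Sequence[object],
    teacher: Callable[[object], Pose],
    update_step: Callable[[List[Sample]], None],
    steps: int,
    batch_size: int,
) -> None:
    """One round of continual update on newly collected unlabeled images."""
    # 1) Pseudo-label new images with the scene-agnostic teacher (distillation target).
    for image in new_images:
        buffer.add(Sample(image, teacher(image), pseudo=True))
    # 2) Update the student APR on batches drawn uniformly from the pool.
    for _ in range(steps):
        update_step(buffer.sample_batch(batch_size))
```

In practice `teacher` would wrap a scene-agnostic localizer and `update_step` would run one gradient step of the APR model on the batch; both are left abstract here because the listing does not specify them.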
