Self-Supervised Domain Adaptation for Visual Navigation with Global Map Consistency
This work addresses domain adaptation for embodied agents in robotics, enabling generalization to unseen noisy environments; the contribution is incremental, as it builds on existing self-supervised techniques.
The paper tackles the problem of adapting a visual navigation agent from noiseless to noisy environments by proposing a self-supervised method that maximizes global map consistency in round-trip trajectories, resulting in improved localization, mapping accuracy, and downstream task performance without requiring ground-truth data.
We propose a light-weight, self-supervised adaptation for a visual navigation agent to generalize to unseen environments. Given an embodied agent trained in a noiseless environment, our objective is to transfer the agent to a noisy environment where actuation and odometry sensor noise are present. Our method encourages the agent to maximize the consistency between the global maps generated at different time steps in a round-trip trajectory. The proposed task is completely self-supervised, requiring no supervision from ground-truth pose data or an explicit noise model. In addition, optimization of the task objective is extremely light-weight, as training terminates within a few minutes on a commodity GPU. Our experiments show that the proposed task helps the agent to successfully transfer to new, noisy environments. The transferred agent exhibits improved localization and mapping accuracy, which in turn leads to enhanced performance in downstream visual navigation tasks. Moreover, we demonstrate test-time adaptation with our self-supervised task to show its potential applicability in real-world deployment.
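To make the idea of global map consistency concrete, the following is a minimal sketch and not the paper's actual objective. It assumes that global maps are 2D occupancy grids with values in [0, 1], that each map comes with a boolean mask of observed cells, and that inconsistency is measured as the mean squared difference over cells observed in both maps. The function name `map_consistency_loss` and the toy grids are hypothetical, introduced only for illustration.

```python
import numpy as np


def map_consistency_loss(map_a, map_b, seen_a, seen_b):
    """Mean squared difference between two occupancy grids,
    restricted to cells observed in both maps (assumed loss form)."""
    both = seen_a & seen_b
    if not both.any():
        # No overlapping observations: nothing to compare.
        return 0.0
    diff = map_a[both] - map_b[both]
    return float(np.mean(diff ** 2))


# Toy global map built on the outbound leg: a short wall in an 8x8 grid.
grid = np.zeros((8, 8))
grid[2:5, 3] = 1.0
seen = np.ones_like(grid, dtype=bool)

# Map rebuilt on the return leg with a one-cell pose drift along the x axis.
drifted = np.roll(grid, shift=1, axis=1)

loss_clean = map_consistency_loss(grid, grid, seen, seen)
loss_drift = map_consistency_loss(grid, drifted, seen, seen)
print(loss_clean, loss_drift)
```

In this toy setting, a perfectly localized agent reproduces the same map on the way back and incurs zero loss, while pose drift misaligns the wall and raises the loss. Minimizing such a quantity needs no ground-truth poses, which is the sense in which the task is self-supervised.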