Robust Navigation with Cross-Modal Fusion and Knowledge Transfer
This work addresses generalization and sim-to-real transfer challenges in mobile robot navigation. The contribution is incremental, as it builds on existing teacher-student distillation methods.
The paper tackles poor generalization and the simulation-reality gap in learning-based navigation for mobile robots. It proposes a cross-modal fusion and knowledge transfer method that is reported to outperform the baselines by a large margin and to navigate robustly under varying conditions.
Recently, learning-based approaches have shown promising results in navigation tasks. However, poor generalization capability and the simulation-reality gap prevent their wide application. We consider the problem of improving the generalization of mobile robots and achieving sim-to-real transfer of navigation skills. To that end, we propose a cross-modal fusion method and a knowledge transfer framework, realized by a teacher-student distillation architecture. The teacher learns a discriminative representation and a near-perfect policy in an ideal environment. By imitating the behavior and representation of the teacher, the student learns to align the features from noisy multi-modal input and to reduce the influence of variations on the navigation policy. We evaluate our method in simulated and real-world environments. Experiments show that our method outperforms the baselines by a large margin and achieves robust navigation performance under varying working conditions.
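The student objective described above, imitating both the behavior and the representation of the teacher, can be illustrated with a minimal sketch. The abstract does not specify the loss functions, so the mean-squared-error terms, the `feature_weight` parameter, and all function names below are assumptions made for illustration, not the method's actual formulation.

```python
from typing import Sequence


def mean_squared_error(a: Sequence[float], b: Sequence[float]) -> float:
    """Mean squared difference between two equal-length, non-empty vectors."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("vectors must be non-empty and have the same length")
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def distillation_loss(
    teacher_features: Sequence[float],
    teacher_action: Sequence[float],
    student_features: Sequence[float],
    student_action: Sequence[float],
    feature_weight: float = 1.0,
) -> float:
    """Hypothetical student loss: behavior imitation plus feature alignment.

    Assumption: both terms are plain MSE; the source does not state
    which distance or weighting is actually used.
    """
    # Behavior term: the student's action should match the teacher's action.
    behavior_term = mean_squared_error(student_action, teacher_action)
    # Representation term: the student's features, computed from noisy
    # multi-modal input, should align with the teacher's features.
    feature_term = mean_squared_error(student_features, teacher_features)
    return behavior_term + feature_weight * feature_term


# Usage: the teacher outputs come from the ideal environment, the student
# outputs from the noisy one (all values are made up for illustration).
loss = distillation_loss(
    teacher_features=[1.0, 0.0],
    teacher_action=[0.5, 0.0],
    student_features=[0.5, 0.0],
    student_action=[0.5, 0.5],
    feature_weight=2.0,
)
```

In this sketch the teacher's outputs act as fixed targets, so only the student would be updated when the loss is minimized. Weighting the feature term separately makes the trade-off between matching actions and matching representations explicit.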