Knowledge Distillation for Feature Extraction in Underwater VSLAM
This work addresses feature extraction in underwater environments for robotics and autonomous navigation. It is an incremental contribution that adapts existing in-air methods to the underwater domain.
The paper tackles learning-based feature detection and matching for underwater visual SLAM. It proposes a cross-modal knowledge distillation framework that trains a network on synthetic underwater images generated from in-air RGBD data, and it evaluates the method on an existing dataset and a new one.
In recent years, learning-based feature detection and matching have outperformed manually designed methods in in-air settings. However, learning features in underwater scenarios remains challenging because annotated underwater datasets are not available. This paper proposes a cross-modal knowledge distillation framework for training an underwater feature detection and matching network (UFEN). In particular, we use in-air RGBD data to generate synthetic underwater images based on a physical underwater image formation model, and we employ these images as the medium to distil knowledge from a SuperPoint teacher model pretrained on in-air images. By adding a binarization layer, we embed UFEN into the ORB-SLAM3 framework as a replacement for the ORB feature. To test the effectiveness of our method, we built a new underwater dataset with ground-truth measurements, named EASI (https://github.com/Jinghe-mel/UFEN-SLAM), recorded in an indoor water tank at different turbidity levels. Experimental results on an existing dataset and on our new dataset demonstrate the effectiveness of our method.
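To make two of the steps above concrete, the following is a minimal sketch and not the authors' implementation. It assumes the widely used simplified image formation model I = J * exp(-beta * d) + B * (1 - exp(-beta * d)), where J is the in-air colour image, d is the depth taken from the RGBD frame, beta is a per-channel attenuation coefficient, and B is the backscatter (veiling light) colour. It also assumes that the binarization layer can be approximated by sign thresholding of a 256-dimensional float descriptor, which yields 32 bytes, the same size as an ORB descriptor. The function names and the coefficient values are hypothetical and chosen only for illustration.

```python
import numpy as np


def synthesize_underwater(rgb, depth, beta, backscatter):
    """Render a synthetic underwater image from an in-air RGBD frame.

    Assumed model (not confirmed by the abstract):
        I = J * exp(-beta * d) + B * (1 - exp(-beta * d))
    rgb:         H x W x 3 array with values in [0, 1] (J)
    depth:       H x W array in metres (d)
    beta:        three per-channel attenuation coefficients
    backscatter: three-channel veiling light colour (B)
    """
    rgb = np.asarray(rgb, dtype=np.float64)
    depth = np.asarray(depth, dtype=np.float64)[..., None]  # H x W x 1
    beta = np.asarray(beta, dtype=np.float64)
    backscatter = np.asarray(backscatter, dtype=np.float64)
    transmission = np.exp(-beta * depth)  # broadcasts to H x W x 3
    image = rgb * transmission + backscatter * (1.0 - transmission)
    return np.clip(image, 0.0, 1.0)


def binarize_descriptors(descriptors):
    """Sign-threshold float descriptors (N x D) and pack the bits into bytes.

    This is an assumed stand-in for the binarization layer; D = 256
    gives N x 32 uint8 rows.
    """
    bits = (np.asarray(descriptors) > 0).astype(np.uint8)
    return np.packbits(bits, axis=1)


def hamming_distance(a, b):
    """Number of differing bits between two packed binary descriptors."""
    return int(np.unpackbits(np.bitwise_xor(a, b)).sum())


# Illustrative inputs only: a flat grey scene at four different depths.
rgb = np.full((2, 2, 3), 0.8)
depth = np.array([[0.0, 1.0], [2.0, 5.0]])
synthetic = synthesize_underwater(
    rgb, depth, beta=[0.6, 0.2, 0.1], backscatter=[0.05, 0.35, 0.40]
)

# Random float descriptors standing in for a network output.
rng = np.random.default_rng(0)
float_descriptors = rng.standard_normal((4, 256))
packed = binarize_descriptors(float_descriptors)
```

In this sketch a pixel at zero depth keeps its in-air colour, while distant pixels fade towards the backscatter colour, with the red channel attenuating fastest under the chosen coefficients. The packed descriptors can be compared with a Hamming distance, which is the kind of matching a binary-descriptor pipeline such as ORB-SLAM3 relies on.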