Deep Unsupervised Common Representation Learning for LiDAR and Camera Data using Double Siamese Networks
This work addresses sensor modality integration for autonomous robots, but it appears incremental because it builds on existing unsupervised learning and Siamese network techniques.
The paper tackles the domain gap between LiDAR and camera data for autonomous robots by proposing two unsupervised frameworks for common representation learning. The results are evaluated on common computer vision applications.
Domain gaps between sensor modalities pose a challenge for the design of autonomous robots. Taking a step towards closing this gap, we propose two unsupervised training frameworks for finding a common representation of LiDAR and camera data. The first method utilizes a double Siamese training structure to ensure consistency in the results. The second method uses a Canny edge image to guide the networks towards a desired representation. All networks are trained in an unsupervised manner, leaving room for scalability. The results are evaluated using common computer vision applications, and the limitations of the proposed approaches are outlined.
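The abstract does not specify the loss functions, so the following is only a minimal sketch of how an edge-guided objective of the second kind might be assembled. It is not the authors' formulation. All names (`edge_map`, `l1_loss`, `guided_loss`, `weight`) are hypothetical, the L1 terms are an assumption, and the edge detector is a plain gradient-magnitude threshold standing in for Canny, which additionally applies smoothing, non-maximum suppression, and hysteresis thresholding.

```python
from typing import List

Image = List[List[float]]  # row-major grayscale image, values in [0, 1]


def edge_map(img: Image, threshold: float = 0.25) -> Image:
    """Binary edge map from central-difference gradient magnitude.

    Simplified stand-in for Canny (assumption): no smoothing,
    non-maximum suppression, or hysteresis. Border pixels stay 0.
    """
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = (img[y][x + 1] - img[y][x - 1]) / 2.0
            gy = (img[y + 1][x] - img[y - 1][x]) / 2.0
            if (gx * gx + gy * gy) ** 0.5 >= threshold:
                out[y][x] = 1.0
    return out


def l1_loss(a: Image, b: Image) -> float:
    """Mean absolute difference between two equally sized images."""
    total = sum(abs(pa - pb) for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))
    return total / (len(a) * len(a[0]))


def guided_loss(lidar_repr: Image, camera_repr: Image,
                camera_img: Image, weight: float = 1.0) -> float:
    """Hypothetical objective: cross-modal consistency plus edge guidance.

    lidar_repr and camera_repr stand for the outputs of the two
    networks; the edge map of the camera image is the guidance target.
    """
    target = edge_map(camera_img)
    consistency = l1_loss(lidar_repr, camera_repr)
    guidance = l1_loss(lidar_repr, target) + l1_loss(camera_repr, target)
    return consistency + weight * guidance
```

In this sketch the loss is zero only when both representations agree with each other and with the edge target, which is one way to read "guiding the networks towards a desired representation". No labels are needed, since the target is computed from the camera image itself.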