Self-supervised learning: When is fusion of the primary and secondary sensor cue useful?
This work addresses a problem of both theoretical and practical relevance to robotics and sensor fusion, and offers incremental insights into improving self-supervised learning methods.
The paper tackles the problem of determining when a robot should fuse its primary and secondary sensor cues in self-supervised learning. It proves that fusion is favorable under specific conditions, namely a strong prior on the target value or a sufficiently accurate secondary cue. These findings are validated with computational experiments and with a real-world case study in which fusion improves height estimation on a flying robot.
Self-supervised learning (SSL) is a reliable learning mechanism in which a robot enhances its perceptual capabilities. Typically, in SSL a trusted, primary sensor cue provides supervised training data to a secondary sensor cue. In this article, a theoretical analysis is performed on the fusion of the primary and secondary cue in a minimal model of SSL. A proof is provided that determines the specific conditions under which it is favorable to perform fusion. In short, fusion is favorable when (i) the prior on the target value is strong or (ii) the secondary cue is sufficiently accurate. The theoretical findings are validated with computational experiments. Subsequently, a real-world case study is performed to investigate whether fusion in SSL is also beneficial when the assumptions of the minimal model are not met. In particular, a flying robot learns to map pressure measurements to sonar height measurements and then fuses the two, resulting in better height estimation. Fusion is also beneficial in the opposite case, when pressure is the primary cue. The analysis and results encourage the study of SSL fusion for other robots and sensors as well.
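The SSL-then-fuse pipeline described above can be illustrated with a small simulation. The following is only a toy sketch and not the minimal model or the proof from the article: it assumes a zero-mean Gaussian prior on the target, Gaussian noise on both cues, a secondary reading that is linear in the target with a scale and offset unknown to the learner, a known primary noise level, and inverse-variance weighting as the fusion rule. The function name `simulate_ssl_fusion` and all parameter names are hypothetical.

```python
import random


def simulate_ssl_fusion(sigma_t, sigma_p, sigma_s, n_train=2000, n_test=2000, seed=0):
    """Return (mse_primary, mse_secondary, mse_fused) for a toy SSL setup."""
    rng = random.Random(seed)

    def sample():
        t = rng.gauss(0.0, sigma_t)                    # target, drawn from the prior (assumed Gaussian)
        x_p = t + rng.gauss(0.0, sigma_p)              # primary cue: direct but noisy
        x_s = 2.0 * t + 1.0 + rng.gauss(0.0, sigma_s)  # secondary reading: assumed linear in the target
        return t, x_p, x_s

    train = [sample() for _ in range(n_train)]

    # SSL step: the primary cue supplies the labels for a least-squares
    # line that maps the secondary reading to a target estimate.
    mean_s = sum(s for _, _, s in train) / n_train
    mean_p = sum(p for _, p, _ in train) / n_train
    cov_sp = sum((s - mean_s) * (p - mean_p) for _, p, s in train)
    var_s = sum((s - mean_s) ** 2 for _, _, s in train)
    slope = cov_sp / var_s
    intercept = mean_p - slope * mean_s

    def predict(s):
        return slope * s + intercept

    # The residuals against the primary cue also contain the primary noise,
    # so subtract it (assumed known) to estimate the learned cue's own error.
    resid = sum((p - predict(s)) ** 2 for _, p, s in train) / n_train
    var_sec = max(resid - sigma_p ** 2, 1e-12)

    # Inverse-variance weighting: the noisier cue gets the smaller weight.
    w_p = var_sec / (var_sec + sigma_p ** 2)

    test = [sample() for _ in range(n_test)]
    mse_primary = sum((p - t) ** 2 for t, p, _ in test) / n_test
    mse_secondary = sum((predict(s) - t) ** 2 for t, _, s in test) / n_test
    mse_fused = sum((w_p * p + (1.0 - w_p) * predict(s) - t) ** 2
                    for t, p, s in test) / n_test
    return mse_primary, mse_secondary, mse_fused
```

Calling, for example, `simulate_ssl_fusion(1.0, 0.5, 0.2)` returns the three mean squared errors, which can be compared across different values of the prior width `sigma_t` and the secondary noise `sigma_s`. Because the weights here are tuned with a known primary noise level, this sketch does not reproduce the conditions derived in the article; it only shows the order of the steps: learn the secondary cue from the primary one, then fuse the two estimates.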