CV · LG · NE · Dec 22, 2014

Multi-modal Sensor Registration for Vehicle Perception via Deep Neural Networks

arXiv:1412.7006v2 · 30 citations
Originality: Incremental advance
AI Analysis

This paper addresses the need for reliable multi-modal registration to enhance perception in automated vehicles, but it appears incremental because it builds on existing deep learning approaches for sensor fusion.

The paper tackles the problem of spatio-temporal alignment for LiDAR-video systems in automated vehicles by developing a deep learning method to detect misalignment, tested on the Ford LiDAR-video driving dataset.

The ability to simultaneously leverage multiple modes of sensor information is critical for perception of an automated vehicle's physical surroundings. Spatio-temporal alignment for registration of the incoming information is often a prerequisite to analyzing the fused data. The persistence and reliability of multi-modal registration are therefore key to the stability of decision support systems ingesting the fused information. LiDAR-video systems like those on many driverless cars are a common example of where keeping the LiDAR and video channels registered to common physical features is important. We develop a deep learning method that takes multiple channels of heterogeneous data to detect the misalignment of the LiDAR-video inputs. A number of variations were tested on the Ford LiDAR-video driving test data set and will be discussed. To the best of our knowledge, the use of multi-modal deep convolutional neural networks for dynamic real-time LiDAR-video registration has not been presented.

