INF: Implicit Neural Fusion for LiDAR and Camera
The work addresses sensor fusion challenges in robotics, such as differing data representations and calibration difficulties, but its contribution is incremental because it builds on existing implicit neural representation methods.
The paper tackles LiDAR-camera sensor fusion by proposing Implicit Neural Fusion (INF), which uses implicit neural representations to unify scene information and to jointly estimate LiDAR poses and extrinsic parameters. The authors report high accuracy and stable performance in their experiments.
Sensor fusion has become a popular topic in robotics. However, conventional fusion methods encounter many difficulties, such as data representation differences, sensor variations, and extrinsic calibration. For example, the calibration methods used for LiDAR-camera fusion often require manual operation and auxiliary calibration targets. Implicit neural representations (INRs) have been developed for 3D scenes, and the volume density distribution involved in an INR unifies the scene information obtained by different types of sensors. Therefore, we propose implicit neural fusion (INF) for LiDAR and camera. INF first trains a neural density field of the target scene using LiDAR frames. Then, a separate neural color field is trained using camera images and the trained neural density field. During training, INF also estimates the LiDAR poses and optimizes the extrinsic parameters. Our experiments demonstrate the high accuracy and stable performance of the proposed method.
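To make concrete how a single volume density distribution can serve both sensors, here is a minimal sketch of volume rendering along one ray. It assumes the standard NeRF-style quadrature, which this text does not confirm as INF's exact formulation. The function names (`render_weights`, `render_depth`, `render_color`) and the per-sample arrays `sigma`, `rgb`, and `t` are hypothetical, chosen only for illustration. The same weights derived from density yield an expected depth, comparable to a LiDAR range, and an expected color, comparable to a camera pixel.

```python
import numpy as np

def render_weights(sigma, t):
    """Per-sample weights along one ray (NeRF-style quadrature; an assumption here).

    sigma: (N,) non-negative densities at the samples
    t:     (N,) increasing sample distances along the ray
    """
    # Spacing between adjacent samples; the last segment is treated as very long.
    delta = np.diff(t, append=t[-1] + 1e10)
    # Opacity contributed by each segment.
    alpha = 1.0 - np.exp(-sigma * delta)
    # Transmittance: fraction of the ray that reaches sample i unoccluded.
    trans = np.cumprod(np.concatenate([[1.0], 1.0 - alpha[:-1]]))
    return trans * alpha

def render_depth(sigma, t):
    """Expected depth along the ray, comparable to a LiDAR range measurement."""
    return float(np.sum(render_weights(sigma, t) * t))

def render_color(sigma, rgb, t):
    """Expected color along the ray, comparable to a camera pixel.

    rgb: (N, 3) colors predicted by a separate color field at the samples.
    """
    return render_weights(sigma, t) @ rgb
```

In this sketch, a density field fitted so that `render_depth` matches LiDAR ranges can then be held fixed while only the colors passed to `render_color` are fitted to images, mirroring the two-stage order described above. For a ray whose density is concentrated at a single sample, the weights collapse onto that sample, so the rendered depth and color are simply that sample's distance and color.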