RO · CV · LG · Sep 18, 2020

Multi-modal Experts Network for Autonomous Driving

arXiv:2009.08876v1 · 7 citations
Originality: Incremental advance
AI Analysis

This work addresses computational and overfitting issues for autonomous driving systems using multiple sensors, but it is incremental as it builds on existing end-to-end learning approaches.

The paper tackles the challenges of computational complexity and overfitting in multi-sensor autonomous driving systems by introducing a novel multi-modal experts network with a gating mechanism and multi-stage training, demonstrating its plausibility on a 1/6 scale truck with three cameras and one LiDAR.

End-to-end learning from sensory data has shown promising results in autonomous driving. While employing many sensors enhances world perception and should lead to more robust and reliable behavior of autonomous vehicles, it is challenging to train and deploy such a network, and at least two problems are encountered in the considered setting. The first is the increase in computational complexity with the number of sensing devices. The other is the phenomenon of the network overfitting to the simplest and most informative input. We address both challenges with a novel, carefully tailored multi-modal experts network architecture and propose a multi-stage training procedure. The network contains a gating mechanism, which selects the most relevant input at each inference time step using a mixed discrete-continuous policy. We demonstrate the plausibility of the proposed approach on our 1/6 scale truck equipped with three cameras and one LiDAR.
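To make the gating idea concrete, here is a minimal sketch of one inference step, written under several assumptions that are not taken from the paper. The gate scores are passed in as plain numbers, whereas the paper's gate is a learned network. The "experts" are toy linear functions. The sensor names, `gated_step`, and `make_expert` are all hypothetical. The sketch only illustrates the general pattern the abstract describes: a discrete choice of one input, followed by a continuous control output computed from that input alone, so the cost per step does not grow with the number of sensors.

```python
import math
from typing import Callable, Dict, List, Sequence, Tuple

# Hypothetical expert type: maps one sensor's features to a steering value.
Expert = Callable[[Sequence[float]], float]


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over the gate scores."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def gated_step(
    gate_logits: Sequence[float],
    sensor_names: Sequence[str],
    inputs: Dict[str, Sequence[float]],
    experts: Dict[str, Expert],
) -> Tuple[str, float, List[float]]:
    """Select the highest-scoring sensor and run only its expert."""
    probs = softmax(gate_logits)
    best = max(range(len(probs)), key=probs.__getitem__)  # discrete choice
    name = sensor_names[best]
    steering = experts[name](inputs[name])  # continuous output
    return name, steering, probs


calls: List[str] = []  # records which experts were actually evaluated


def make_expert(name: str, weight: float) -> Expert:
    """Toy stand-in for a per-sensor network (assumption, not the paper's model)."""
    def expert(features: Sequence[float]) -> float:
        calls.append(name)
        return weight * sum(features) / len(features)
    return expert


# Three cameras and one LiDAR, as in the abstract; the names are illustrative.
sensor_names = ["left_camera", "center_camera", "right_camera", "lidar"]
experts = {n: make_expert(n, 1.0) for n in sensor_names}
inputs = {
    "left_camera": [0.1, 0.2],
    "center_camera": [0.0, 0.4],
    "right_camera": [0.3, 0.3],
    "lidar": [0.5, 0.1],
}
gate_logits = [0.2, 1.5, -0.3, 0.9]  # would come from a learned gate

name, steering, probs = gated_step(gate_logits, sensor_names, inputs, experts)
print(name, steering, calls)
```

After the call, `calls` contains a single entry, which is the point of the design: the unselected experts are never evaluated. How the gate is trained, and how the paper's multi-stage procedure counters overfitting to one input, is outside what this sketch covers.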

