CV · Feb 7, 2023

Scaling Vision-based End-to-End Driving with Multi-View Attention Learning

arXiv:2302.03198v3 · 12 citations · h-index: 11
Originality: Incremental advance
AI Analysis

This work addresses the challenge of developing cost-effective and maintainable autonomous driving systems for the automotive industry, though it is incremental as it builds on an existing baseline.

The paper tackles the problem of improving vision-based end-to-end driving models by enhancing CILRS with higher-resolution images and an attention mechanism, resulting in CIL++ that achieves competitive performance compared to more costly models.

In end-to-end driving, human driving demonstrations are used to train perception-based driving models by imitation learning. This process is supervised on vehicle signals (e.g., steering angle, acceleration) but does not require extra costly supervision (human labeling of sensor data). As a representative of such vision-based end-to-end driving models, CILRS is commonly used as a baseline to compare with new driving models. So far, some recent models achieve better performance than CILRS by using expensive sensor suites and/or large amounts of human-labeled data for training. Given the difference in performance, one may think that it is not worth pursuing vision-based pure end-to-end driving. However, we argue that this approach still has great value and potential considering cost and maintenance. In this paper, we present CIL++, which improves on CILRS both by processing higher-resolution images, using a human-inspired horizontal field of view (HFOV) as an inductive bias, and by incorporating a proper attention mechanism. CIL++ achieves competitive performance compared to models that are more costly to develop. We propose to replace CILRS with CIL++ as a strong vision-based pure end-to-end driving baseline supervised only by vehicle signals and trained by conditional imitation learning.
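To make the two ingredients named in the abstract concrete (attention across camera views, and supervision on vehicle signals only), here is a toy sketch in plain Python. It is not the CIL++ architecture: the three views, the 2-D embeddings, the identity query/key/value projections, the mean pooling, and the L1 loss are all assumptions chosen for illustration.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Scaled dot-product self-attention over per-view embeddings.

    Simplification (assumption): queries, keys, and values are the
    tokens themselves, i.e. no learned projection matrices.
    """
    d = len(tokens[0])
    fused = []
    for q in tokens:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in tokens]
        weights = softmax(scores)
        fused.append([sum(w * v[i] for w, v in zip(weights, tokens)) for i in range(d)])
    return fused


def l1_action_loss(predicted, demonstrated):
    """Mean absolute error between predicted and demonstrated vehicle signals."""
    return sum(abs(p - t) for p, t in zip(predicted, demonstrated)) / len(predicted)


# Hypothetical embeddings for three camera views (e.g. left, central, right).
views = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
fused = self_attention(views)

# Toy "head": average the fused views and read the 2-D result directly as
# (steering, acceleration). A real model would use a learned network here.
pooled = [sum(col) / len(fused) for col in zip(*fused)]

# The only supervision is the demonstrated vehicle signal; no sensor-data labels.
loss = l1_action_loss(pooled, [0.5, 0.5])
```

The point of the sketch is the data flow: each view attends to every other view before the driving signal is predicted, and the training target is just what the human driver did.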
