RO · LG · Mar 3, 2022

Quantity over Quality: Training an AV Motion Planner with Large Scale Commodity Vision Data

arXiv:2203.01681v2 · 1 citation · h-index: 27
AI Analysis

Training on commodity vision data lowers the financial barrier to scaling self-driving systems, because the data needed to cover rare driving events can be collected cost-effectively.

The paper tackles the high cost of collecting expert driving data for autonomous vehicle motion planning by using cheaper commodity vision sensors instead of expensive HD sensor suites. It demonstrates that a planner trained on 100 hours of commodity vision data outperforms one trained on 25 hours of HD data.

With the Autonomous Vehicle (AV) industry shifting towards machine-learned approaches for motion planning, the performance of self-driving systems is starting to rely heavily on large quantities of expert driving demonstrations. However, collecting this demonstration data typically involves expensive HD sensor suites (LiDAR + RADAR + cameras), which quickly becomes financially infeasible at the scales required. This motivates the use of commodity sensors like cameras for data collection, which are an order of magnitude cheaper than HD sensor suites but offer lower fidelity. Leveraging these sensors for training an AV motion planner opens a financially viable path to observing the "long tail" of driving events. As our main contribution, we show that it is possible to train a high-performance motion planner on commodity vision data that outperforms planners trained on HD-sensor data, at a fraction of the cost. To the best of our knowledge, we are the first to demonstrate this using real-world data. We compare the performance of the autonomy system on these two sensor configurations and show that increased quantity can compensate for lower sensor fidelity: a planner trained on 100h of commodity vision data outperforms one trained on 25h of expensive HD data. We also share the engineering challenges we had to tackle to make this work.
