CW-ERM: Improving Autonomous Driving Planning with Closed-loop Weighted Empirical Risk Minimization
This work addresses the practical issue of poor real-world performance in self-driving vehicle policies, though it appears incremental as it builds on existing ERM methods with a closed-loop weighting approach.
The paper tackles imitation learning for autonomous driving policies, which often perform poorly in closed-loop evaluation because of open-loop training bias, and introduces CW-ERM to reduce collisions and improve other non-differentiable closed-loop metrics in urban driving.
The imitation learning of self-driving vehicle policies through behavioral cloning is often carried out in an open-loop fashion, ignoring the effect of actions on future states. Training such policies purely with Empirical Risk Minimization (ERM) can be detrimental to real-world performance, as it biases policy networks towards matching only open-loop behavior, which leads to poor results when they are evaluated in closed-loop. In this work, we develop an efficient and simple-to-implement principle called Closed-loop Weighted Empirical Risk Minimization (CW-ERM), in which a closed-loop evaluation procedure is first used to identify training data samples that are important for practical driving performance, and then we upweight these samples to help debias the policy network. We evaluate CW-ERM on a challenging urban driving dataset and show that this procedure yields a significant reduction in collisions as well as improvements in other non-differentiable closed-loop metrics.
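The three stages described above (train with plain ERM, run a closed-loop evaluation to find the samples that matter, then retrain with those samples upweighted) can be sketched on a toy problem. This is only an illustrative sketch, not the authors' implementation: the 1-D linear "policy", the function names `fit_weighted_erm` and `cw_erm`, the `closed_loop_failures` callback, and the `upweight` factor are all hypothetical, and the abstract does not say how weights are chosen or how closed-loop failures are mapped back to training samples.

```python
from typing import Callable, List, Sequence, Set, Tuple

Sample = Tuple[float, float]   # (observation, expert action); toy 1-D stand-ins
Policy = Tuple[float, float]   # (slope, intercept) of a toy linear policy


def fit_weighted_erm(data: Sequence[Sample], weights: Sequence[float]) -> Policy:
    """Minimize sum_i w_i * (a * x_i + b - y_i)^2 in closed form."""
    total = sum(weights)
    mean_x = sum(w * x for w, (x, _) in zip(weights, data)) / total
    mean_y = sum(w * y for w, (_, y) in zip(weights, data)) / total
    cov = sum(w * (x - mean_x) * (y - mean_y) for w, (x, y) in zip(weights, data))
    var = sum(w * (x - mean_x) ** 2 for w, (x, _) in zip(weights, data))
    slope = cov / var if var > 0 else 0.0
    return slope, mean_y - slope * mean_x


def cw_erm(
    data: Sequence[Sample],
    closed_loop_failures: Callable[[Policy, Sequence[Sample]], Set[int]],
    upweight: float = 5.0,  # assumed value; not taken from the paper
) -> Tuple[Policy, Policy, List[float]]:
    # Stage 1: plain ERM, i.e. every sample has weight 1.
    base_policy = fit_weighted_erm(data, [1.0] * len(data))

    # Stage 2: a closed-loop evaluation (hypothetical callback standing in
    # for a simulator) returns indices of samples where the policy failed.
    failed = closed_loop_failures(base_policy, data)

    # Stage 3: upweight the identified samples and retrain.
    weights = [upweight if i in failed else 1.0 for i in range(len(data))]
    final_policy = fit_weighted_erm(data, weights)
    return base_policy, final_policy, weights
```

In practice the callback would wrap a simulator rollout that reports events such as collisions; here it can be any function that returns a set of sample indices, for example `cw_erm(data, lambda policy, d: {3})`.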