RO LG Dec 14, 2022

Particle-Based Score Estimation for State Space Model Learning in Autonomous Driving

Angad Singh, Omar Makhlouf, Maximilian Igl, Joao Messias, Arnaud Doucet, Shimon Whiteson

arXiv:2212.06968v15.53 citations h-index: 89

Originality: Incremental advance

AI Analysis

This work addresses a specific bottleneck in multi-object state estimation for autonomous driving, offering an incremental improvement over prior particle filtering techniques.

The paper tackles the problem of learning maximum-likelihood parameters for state space models in autonomous driving, where existing methods suffer from biased or high-variance gradient estimates because the resampling step in particle filters is non-differentiable. The authors propose a particle-based score estimation method, built on Fisher's identity, that learns better models than existing techniques and trains more stably, as demonstrated on real autonomous vehicle data.

Multi-object state estimation is a fundamental problem for robotic applications where a robot must interact with other moving objects. Typically, other objects' relevant state features are not directly observable, and must instead be inferred from observations. Particle filtering can perform such inference given approximate transition and observation models. However, these models are often unknown a priori, yielding a difficult parameter estimation problem since observations jointly carry transition and observation noise. In this work, we consider learning maximum-likelihood parameters using particle methods. Recent methods addressing this problem typically differentiate through time in a particle filter, which requires workarounds to the non-differentiable resampling step that yield biased or high-variance gradient estimates. By contrast, we exploit Fisher's identity to obtain a particle-based approximation of the score function (the gradient of the log likelihood) that yields a low-variance estimate while only requiring stepwise differentiation through the transition and observation models. We apply our method to real data collected from autonomous vehicles (AVs) and show that it learns better models than existing techniques and is more stable in training, yielding an effective smoother for tracking the trajectories of vehicles around an AV.
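To make the role of Fisher's identity concrete, here is a minimal sketch on a toy model. It is not the paper's algorithm. Fisher's identity says the score equals the posterior expectation of the gradient of the joint log density of states and observations. That joint log density is a sum of per-step transition and observation terms, so only stepwise gradients are needed. Everything in the sketch is an assumption made for illustration: the scalar linear-Gaussian model, the names `simulate` and `particle_score`, the fixed initial state, and the simple path-space accumulation inside a bootstrap particle filter, which is known to degrade on long sequences.

```python
import math
import random


def simulate(a, sigma_x, sigma_y, T, rng):
    """Draw y_1..y_T from x_t = a*x_{t-1} + sigma_x*eps, y_t = x_t + sigma_y*eta."""
    x, ys = 0.0, []
    for _ in range(T):
        x = a * x + sigma_x * rng.gauss(0.0, 1.0)
        ys.append(x + sigma_y * rng.gauss(0.0, 1.0))
    return ys


def particle_score(ys, a, sigma_x, sigma_y, n_particles, rng):
    """Estimate d/da log p_a(y_{1:T}) via Fisher's identity (toy sketch).

    Each particle carries alpha, the running sum of d/da log f_a(x_t | x_{t-1})
    along its ancestral path. The observation model does not depend on a here,
    so it contributes no gradient term.
    """
    xs = [0.0] * n_particles            # x_0 fixed at 0 (toy assumption)
    alphas = [0.0] * n_particles
    weights = [1.0 / n_particles] * n_particles
    for y in ys:
        # Multinomial resampling: alpha is copied with its ancestor,
        # and nothing is differentiated through this step.
        idx = rng.choices(range(n_particles), weights=weights, k=n_particles)
        new_xs, new_alphas, logw = [], [], []
        for i in idx:
            x_prev = xs[i]
            x = a * x_prev + sigma_x * rng.gauss(0.0, 1.0)
            # Stepwise gradient of the Gaussian transition log density w.r.t. a.
            grad = (x - a * x_prev) * x_prev / sigma_x ** 2
            new_xs.append(x)
            new_alphas.append(alphas[i] + grad)
            logw.append(-0.5 * ((y - x) / sigma_y) ** 2)
        # Normalize the weights in a numerically stable way.
        m = max(logw)
        w = [math.exp(l - m) for l in logw]
        total = sum(w)
        weights = [wi / total for wi in w]
        xs, alphas = new_xs, new_alphas
    # Weighted average of accumulated gradients approximates the score.
    return sum(wi * ai for wi, ai in zip(weights, alphas))


if __name__ == "__main__":
    ys = simulate(0.9, 0.5, 0.5, 100, random.Random(0))
    for a in (0.5, 1.3):
        est = particle_score(ys, a, 0.5, 0.5, 500, random.Random(1))
        print(f"a={a}: estimated score {est:.2f}")
```

On data generated with `a = 0.9`, one would expect the estimate to be positive when evaluated well below that value and negative well above it, so a gradient-ascent step moves `a` toward the maximum-likelihood value. The resampling step only copies each particle's accumulated statistic, which is why no differentiation through resampling is required.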
