Learning to Forecast Aleatoric and Epistemic Uncertainties over Long Horizon Trajectories
This work addresses safety and competency communication for autonomous agents, but the contribution is incremental because it builds on existing world-model and uncertainty-quantification methods.
The paper tackles forecasting of both aleatoric and epistemic uncertainty over long-horizon trajectories for autonomous agents using a deep generative world model, and reports calibrated uncertainty estimates on two reinforcement learning tasks.
Giving autonomous agents the ability to forecast their own outcomes and uncertainty will allow them to communicate their competencies and be used more safely. We accomplish this by using a learned world model of the agent system to forecast full agent trajectories over long time horizons. Real-world systems involve significant sources of both aleatoric and epistemic uncertainty that compound and interact over time in the trajectory forecasts. We develop a deep generative world model that quantifies aleatoric uncertainty while incorporating the effects of epistemic uncertainty during the learning process. We show on two reinforcement learning problems that our uncertainty model produces calibrated outcome uncertainty estimates over the full trajectory horizon.
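
As a rough illustration of how the two kinds of uncertainty can compound over a forecast horizon, the sketch below rolls out a toy ensemble of scalar dynamics models and splits the per-step forecast variance using the law of total variance. This is not the paper's deep generative world model: the linear dynamics, the ensemble used as a stand-in for epistemic uncertainty, and every name and parameter value (`rollout`, `forecast_uncertainty`, `members`, `sigma`) are assumptions made for illustration only.

```python
import random
import statistics


def rollout(a, sigma, x0, horizon, rng):
    """Sample one trajectory from the toy dynamics x[t+1] = a * x[t] + noise.

    The Gaussian noise term plays the role of aleatoric uncertainty.
    """
    xs = [x0]
    for _ in range(horizon):
        xs.append(a * xs[-1] + rng.gauss(0.0, sigma))
    return xs


def forecast_uncertainty(members, sigma, x0, horizon, n_samples, seed=0):
    """Decompose per-step forecast variance for an ensemble of toy models.

    `members` holds one dynamics coefficient per ensemble member; the
    disagreement between members stands in for epistemic uncertainty.
    Returns three lists of length horizon + 1: aleatoric, epistemic, total.
    """
    rng = random.Random(seed)
    # trajs[k][i][t]: member k, sampled trajectory i, time step t
    trajs = [
        [rollout(a, sigma, x0, horizon, rng) for _ in range(n_samples)]
        for a in members
    ]
    aleatoric, epistemic, total = [], [], []
    for t in range(horizon + 1):
        per_member = [[traj[t] for traj in member] for member in trajs]
        means = [statistics.fmean(values) for values in per_member]
        variances = [statistics.pvariance(values) for values in per_member]
        # Aleatoric: average spread of outcomes within each member.
        aleatoric.append(statistics.fmean(variances))
        # Epistemic: spread of the members' mean predictions.
        epistemic.append(statistics.pvariance(means))
        # Total: variance of all samples pooled across members.
        pooled = [x for values in per_member for x in values]
        total.append(statistics.pvariance(pooled))
    return aleatoric, epistemic, total


if __name__ == "__main__":
    alea, epi, tot = forecast_uncertainty(
        members=[0.90, 0.95, 1.00], sigma=0.1, x0=1.0, horizon=10, n_samples=200
    )
    for t in (1, 5, 10):
        print(f"t={t:2d}  aleatoric={alea[t]:.4f}  epistemic={epi[t]:.4f}  total={tot[t]:.4f}")
```

Because every member contributes the same number of samples, the pooled variance at each step equals the sum of the aleatoric and epistemic terms, which makes the decomposition easy to sanity-check. Checking calibration, as the abstract describes, would additionally require comparing forecast intervals against held-out outcomes, which this sketch does not attempt.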