Meta-Inverse Reinforcement Learning for Mean Field Games via Probabilistic Context Variables
This work addresses a practical limitation in designing rewards for interacting agents, though the contribution is incremental: it extends existing IRL methods to handle heterogeneity without modifying the underlying MFG framework.
The paper tackles the problem of inferring reward functions from expert demonstrations in mean field games when agents have heterogeneous and unknown objectives. It proposes a deep latent variable model that outperforms state-of-the-art IRL methods for MFGs in simulated and real-world scenarios.
Designing suitable reward functions for numerous interacting intelligent agents is challenging in real-world applications. Inverse reinforcement learning (IRL) in mean field games (MFGs) offers a practical framework to infer reward functions from expert demonstrations. Although this framework is promising, existing methods assume that agents are homogeneous, which limits their ability to handle demonstrations with heterogeneous and unknown objectives, a common situation in practice. To this end, we propose a deep latent variable MFG model and an associated IRL method. Critically, our method can infer rewards from different yet structurally similar tasks without prior knowledge of the underlying contexts and without modifying the MFG model itself. Our experiments, conducted on simulated scenarios and a real-world spatial taxi-ride pricing problem, demonstrate the superiority of our approach over state-of-the-art IRL methods in MFGs.
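The abstract does not specify the model's architecture, so the following is only a minimal sketch of the general idea of a probabilistic context variable, not the paper's method. It assumes a finite state and action space, a linear encoder that maps a demonstration to a diagonal Gaussian over a latent context, and a linear reward conditioned on the state, action, mean field, and sampled context. All names (`encode_context`, `sample_context`, `reward`), the dimensions, and the random weights are hypothetical and chosen purely for illustration.

```python
import math
import random


def one_hot(index, size):
    """Return a one-hot list of length `size` with a 1.0 at `index`."""
    return [1.0 if i == index else 0.0 for i in range(size)]


def dot(u, v):
    """Inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))


def encode_context(trajectory, num_states, num_actions, w_mean, w_logvar):
    """Map a demonstration to the mean and log-variance of a diagonal Gaussian.

    Hypothetical linear encoder: averages one-hot (state, action) features
    over the trajectory, then applies one weight vector per context dimension.
    """
    summary = [0.0] * (num_states + num_actions)
    for s, a in trajectory:
        step = one_hot(s, num_states) + one_hot(a, num_actions)
        summary = [x + y / len(trajectory) for x, y in zip(summary, step)]
    mean = [dot(summary, w) for w in w_mean]
    logvar = [dot(summary, w) for w in w_logvar]
    return mean, logvar


def sample_context(mean, logvar, rng):
    """Draw a context sample as mean + std * noise (reparameterization)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mean, logvar)]


def reward(s, a, mean_field, context, w_reward, num_states, num_actions):
    """Hypothetical linear reward conditioned on state, action, mean field, context."""
    features = (one_hot(s, num_states) + one_hot(a, num_actions)
                + list(mean_field) + list(context))
    return dot(features, w_reward)


# Illustrative setup; every number below is an assumption.
rng = random.Random(0)
num_states, num_actions, context_dim = 4, 3, 2
step_dim = num_states + num_actions
w_mean = [[rng.gauss(0, 1) for _ in range(step_dim)] for _ in range(context_dim)]
w_logvar = [[rng.gauss(0, 1) for _ in range(step_dim)] for _ in range(context_dim)]
w_reward = [rng.gauss(0, 1) for _ in range(step_dim + num_states + context_dim)]

trajectory = [(0, 1), (2, 0), (3, 2), (1, 1)]  # (state, action) pairs
mean_field = [1.0 / num_states] * num_states   # uniform population distribution

mean, logvar = encode_context(trajectory, num_states, num_actions, w_mean, w_logvar)
context = sample_context(mean, logvar, rng)
r = reward(2, 0, mean_field, context, w_reward, num_states, num_actions)
```

In this sketch, demonstrations from different tasks would yield different context distributions while sharing the same reward weights, which is one way to read "different yet structurally similar tasks". The MFG dynamics are left untouched, since the context enters only through the encoder and the reward.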