LG ML | Dec 31, 2024

Toward Information Theoretic Active Inverse Reinforcement Learning

Ondrej Bajgar, Sid William Gould, Rohan Narayan Langford Mitta, Jonathon Liu, Oliver Newcombe, Jack Golden

arXiv:2501.00381v1 | 4.1 | h-index: 6

Originality: Incremental advance

AI Analysis

This work addresses the challenge of aligning AI decision-making with human preferences in domains such as autonomous driving. The advance is incremental: it builds on prior active IRL methods, extending them from single-state action queries to longer trajectories.

The paper tackles the problem of reducing human effort in inverse reinforcement learning by proposing an information-theoretic active learning approach that selects informative scenarios for demonstrations, showing improved performance in gridworld experiments.

As AI systems become increasingly autonomous, aligning their decision-making to human preferences is essential. In domains like autonomous driving or robotics, it is impossible to write down the reward function representing these preferences by hand. Inverse reinforcement learning (IRL) offers a promising approach to infer the unknown reward from demonstrations. However, obtaining human demonstrations can be costly. Active IRL addresses this challenge by strategically selecting the most informative scenarios for human demonstration, reducing the amount of required human effort. Where most prior work allowed querying the human for an action at one state at a time, we motivate and analyse scenarios where we collect longer trajectories. We provide an information-theoretic acquisition function, propose an efficient approximation scheme, and illustrate its performance through a set of gridworld experiments as groundwork for future work expanding to more general settings.
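To make the idea of an information-theoretic acquisition function concrete, here is a minimal sketch of the simpler single-state setting that the abstract attributes to prior work. It is not the paper's trajectory-based acquisition function or its approximation scheme. It assumes a finite set of candidate reward hypotheses with known Q-values, a Boltzmann-rational demonstrator with inverse temperature `beta`, and it scores each state by the mutual information between the reward hypothesis and the action the human would demonstrate there. All names (`expected_information_gain`, `select_query_state`, `q_values`, `posterior`) are hypothetical.

```python
import numpy as np


def entropy(p, axis=-1):
    """Shannon entropy in nats; probabilities are clipped to avoid log(0)."""
    p = np.clip(p, 1e-12, 1.0)
    return -np.sum(p * np.log(p), axis=axis)


def expected_information_gain(q_values, posterior, beta=1.0):
    """Mutual information between reward hypothesis and demonstrated action.

    q_values:  array of shape (H, S, A), Q-values under each of H candidate
               reward hypotheses (assumed precomputed, e.g. by value iteration).
    posterior: array of shape (H,), current belief over the hypotheses.
    beta:      assumed Boltzmann rationality of the demonstrator.
    Returns an array of shape (S,) with one score per candidate query state.
    """
    # Boltzmann policy pi(a | s, h) for every hypothesis (numerically stable softmax).
    logits = beta * q_values
    logits = logits - logits.max(axis=-1, keepdims=True)
    policy = np.exp(logits)
    policy /= policy.sum(axis=-1, keepdims=True)

    # Predictive action distribution at each state, averaged over the belief.
    marginal = np.einsum("h,hsa->sa", posterior, policy)

    # Expected entropy of the action if the true hypothesis were known.
    conditional = np.einsum("h,hs->s", posterior, entropy(policy))

    # I(R; A | s) = H[E_h pi] - E_h H[pi]
    return entropy(marginal) - conditional


def select_query_state(q_values, posterior, beta=1.0):
    """Index of the state whose demonstrated action is most informative."""
    return int(np.argmax(expected_information_gain(q_values, posterior, beta)))
```

Under these assumptions, a state where all hypotheses prescribe the same behaviour scores zero, while a state where they disagree scores highest. Extending the query from a single action to a whole trajectory, as the abstract describes, would require reasoning over sequences of state-action pairs rather than one action distribution per state.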
