AI | Oct 29, 2024

Predicting Future Actions of Reinforcement Learning Agents

Stephen Chung, Scott Niekum, David Krueger

arXiv:2410.22459v1 | 7.35 citations | h-index: 37 | Has Code | NIPS

Originality: Synthesis-oriented

AI Analysis

This work addresses safety and interaction challenges for real-world RL deployments, though its contribution is incremental: it compares existing prediction approaches across three agent types.

The paper tackles the problem of predicting future actions and events of reinforcement learning agents to improve human-agent interaction and safety. It finds that the internal plans of explicitly planning agents are significantly more informative for prediction than the neuron activations of the other agent types, and that plan-based action prediction is more robust to world-model quality than simulation-based methods. Results for event prediction are more mixed.

As reinforcement learning agents become increasingly deployed in real-world scenarios, predicting future agent actions and events during deployment is important for facilitating better human-agent interaction and preventing catastrophic outcomes. This paper experimentally evaluates and compares the effectiveness of future action and event prediction for three types of RL agents: explicitly planning, implicitly planning, and non-planning. We employ two approaches: the inner state approach, which involves predicting based on the inner computations of the agents (e.g., plans or neuron activations), and a simulation-based approach, which involves unrolling the agent in a learned world model. Our results show that the plans of explicitly planning agents are significantly more informative for prediction than the neuron activations of the other types. Furthermore, using internal plans proves more robust to model quality compared to simulation-based approaches when predicting actions, while the results for event prediction are more mixed. These findings highlight the benefits of leveraging inner states and simulations to predict future agent actions and events, thereby improving interaction and safety in real-world deployments.
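The simulation-based approach mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `predict_actions_by_simulation`, the toy corridor environment, and the hand-written `toy_policy` and `toy_world_model` are all hypothetical stand-ins chosen for illustration. In the paper's setting the world model would be learned and the policy would be a trained agent.

```python
from typing import Callable, List

State = int
Action = int


def predict_actions_by_simulation(
    state: State,
    policy: Callable[[State], Action],
    world_model: Callable[[State, Action], State],
    horizon: int,
) -> List[Action]:
    """Unroll `policy` inside `world_model` for `horizon` steps.

    Returns the sequence of actions the agent is predicted to take.
    """
    actions: List[Action] = []
    for _ in range(horizon):
        action = policy(state)              # query the agent in the imagined state
        actions.append(action)
        state = world_model(state, action)  # advance the imagined state
    return actions


# Hypothetical toy setup: a corridor with positions 0..4.
# The agent moves right (+1) until it reaches position 4, then stays (0).
def toy_policy(s: State) -> Action:
    return 1 if s < 4 else 0


def toy_world_model(s: State, a: Action) -> State:
    # Stand-in for a learned model; here it happens to be exact.
    return min(4, max(0, s + a))


predicted = predict_actions_by_simulation(2, toy_policy, toy_world_model, horizon=4)
print(predicted)  # [1, 1, 0, 0]
```

Because each imagined state is produced by the world model, any model error compounds over the horizon. That is one plausible reading of why the abstract reports simulation-based prediction as more sensitive to model quality than prediction from internal plans.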
