LG, AI · Feb 19, 2025

Safe Learning Under Irreversible Dynamics via Asking for Help

Benjamin Plaut, Juan Liévano-Karim, Hanlin Zhu, Stuart Russell

arXiv:2502.14043v27.12 citations · h-index: 5

Originality: Highly original

AI Analysis

The paper addresses the challenge of enabling agents to learn safely and become self-sufficient in high-stakes, unknown environments without resets. The contribution is incremental in the sense that it builds on standard online learning assumptions.

The paper tackles safe learning in environments with irreversible errors by allowing an agent to ask a mentor for help and to transfer knowledge between similar states. The result is an algorithm whose regret and number of mentor queries are both sublinear in the time horizon for any Markov Decision Process.

Most learning algorithms with formal regret guarantees essentially rely on trying all possible behaviors, which is problematic when some errors cannot be recovered from. Instead, we allow the learning agent to ask for help from a mentor and to transfer knowledge between similar states. We show that this combination enables the agent to learn both safely and effectively. Under standard online learning assumptions, we provide an algorithm whose regret and number of mentor queries are both sublinear in the time horizon for any Markov Decision Process (MDP), including MDPs with irreversible dynamics. Our proof involves a sequence of three reductions which may be of independent interest. Conceptually, our result may be the first formal proof that it is possible for an agent to obtain high reward while becoming self-sufficient in an unknown, unbounded, and high-stakes environment without resets.
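The two ingredients named in the abstract, asking a mentor for help and transferring knowledge between similar states, can be illustrated with a toy sketch. This is not the paper's algorithm and carries none of its guarantees; the class `HelpSeekingAgent`, the `radius` threshold, the one-dimensional state space, and `toy_mentor` are all hypothetical choices made only for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class HelpSeekingAgent:
    """Toy agent (hypothetical): reuse a mentor's action from a nearby state, else ask."""
    mentor: Callable[[float], str]   # assumed oracle returning a safe action
    radius: float                    # assumed similarity threshold between states
    memory: List[Tuple[float, str]] = field(default_factory=list)
    queries: int = 0

    def act(self, state: float) -> str:
        # Knowledge transfer: if a previously queried state is close enough,
        # copy the action the mentor recommended there.
        if self.memory:
            near_state, near_action = min(
                self.memory, key=lambda pair: abs(pair[0] - state)
            )
            if abs(near_state - state) <= self.radius:
                return near_action
        # Otherwise the state is unfamiliar: ask the mentor instead of exploring.
        action = self.mentor(state)
        self.queries += 1
        self.memory.append((state, action))
        return action


def toy_mentor(state: float) -> str:
    # Hypothetical mentor policy on states in [0, 1).
    return "left" if state < 0.5 else "right"


agent = HelpSeekingAgent(mentor=toy_mentor, radius=0.05)
states = [(i * 0.37) % 1.0 for i in range(200)]  # arbitrary state sequence
actions = [agent.act(s) for s in states]
print(f"steps: {len(states)}, mentor queries: {agent.queries}")
```

In this toy setting, the agent queries only when no remembered state lies within `radius`, so the queried states are spread out and their count stays small relative to the number of steps. Copied actions can still be wrong near the mentor's decision boundary, which is the kind of trade-off a formal regret analysis has to account for.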
