LG · Sep 29, 2025

Safe In-Context Reinforcement Learning

Amir Moeini, Minjae Kwon, Alper Kamil Bozkurt, Yuichi Motai, Rohan Chandra, Lu Feng, Shangtong Zhang

arXiv:2509.25582v113.04 citations · h-index: 6

Originality: Incremental advance

AI Analysis

This work addresses safety in ICRL for applications that require risk-sensitive adaptation. It is incremental in that it extends existing ICRL methods with safety constraints.

The authors tackle the problem of ensuring safety during in-context reinforcement learning (ICRL) adaptation. They propose what they describe as the first method to incorporate safety constraints into ICRL within the framework of constrained Markov Decision Processes. The resulting agent maximizes reward while minimizing cost, and it actively adjusts its behavior to the cost tolerance threshold.
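For readers unfamiliar with constrained Markov Decision Processes, the textbook form of the objective is sketched below. The notation is an assumption made for illustration (policy \pi, reward r, cost c, discount \gamma, horizon T, cost budget d) and may differ from the symbols and the exact objective used in the paper.

```latex
\max_{\pi} \; J_r(\pi) = \mathbb{E}_{\pi}\!\left[ \sum_{t=0}^{T} \gamma^{t}\, r(s_t, a_t) \right]
\quad \text{subject to} \quad
J_c(\pi) = \mathbb{E}_{\pi}\!\left[ \sum_{t=0}^{T} \gamma^{t}\, c(s_t, a_t) \right] \le d
```

Under this reading, a larger budget d enlarges the set of feasible policies, which is consistent with the summary's statement that the agent's behavior depends on the cost tolerance threshold.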

In-context reinforcement learning (ICRL) is an emerging RL paradigm where the agent, after some pretraining procedure, is able to adapt to out-of-distribution test tasks without any parameter updates. The agent achieves this by continually expanding the input (i.e., the context) to its policy neural networks. For example, the input could be the entire history of experience that the agent has access to up to the current time step. The agent's performance improves as the input grows, without any parameter updates. In this work, we propose the first method that promotes the safety of ICRL's adaptation process in the framework of constrained Markov Decision Processes. In other words, during the parameter-update-free adaptation process, the agent not only maximizes the reward but also minimizes an additional cost function. We also demonstrate that our agent actively reacts to the threshold (i.e., budget) of the cost tolerance. With a higher cost budget, the agent behaves more aggressively, and with a lower cost budget, the agent behaves more conservatively.
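The adaptation loop described in the abstract can be illustrated with a toy sketch. This is not the authors' method: the hand-written rule `frozen_policy` is a hypothetical stand-in for a pretrained network, and the two-action environment, the names `Step`, `rollout`, `REWARD`, `COST`, and `MAX_STEP_COST` are all assumptions made for illustration. The only point it shows is that the policy itself never changes, the context grows each step, and behavior depends on the cost budget.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    action: int
    reward: float
    cost: float


# Toy environment (assumption): action 0 is safe, action 1 pays more but incurs cost.
ACTIONS = (0, 1)
SAFE_ACTION = 0
REWARD = {0: 0.2, 1: 1.0}
COST = {0: 0.0, 1: 1.0}
MAX_STEP_COST = 1.0  # assumed known upper bound on the cost of a single step


def frozen_policy(context, budget):
    """Hypothetical stand-in for a pretrained policy: a fixed function of
    (context, budget). Nothing in it is trained or updated at test time."""
    remaining = budget - sum(s.cost for s in context)
    by_action = {}
    for s in context:
        by_action.setdefault(s.action, []).append(s)

    # Try unseen actions first, but only if the worst-case cost still fits.
    if remaining >= MAX_STEP_COST:
        for a in ACTIONS:
            if a not in by_action:
                return a

    # Otherwise pick the best observed mean reward among actions
    # whose observed mean cost fits in the remaining budget.
    best, best_reward = SAFE_ACTION, float("-inf")
    for a, steps in by_action.items():
        mean_r = sum(s.reward for s in steps) / len(steps)
        mean_c = sum(s.cost for s in steps) / len(steps)
        if mean_c <= remaining and mean_r > best_reward:
            best, best_reward = a, mean_r
    return best


def rollout(budget, horizon=10):
    """Run one test episode; only the context grows, never the policy."""
    context = []
    for _ in range(horizon):
        a = frozen_policy(context, budget)
        context.append(Step(a, REWARD[a], COST[a]))
    total_reward = sum(s.reward for s in context)
    total_cost = sum(s.cost for s in context)
    return total_reward, total_cost


if __name__ == "__main__":
    for d in (0.0, 3.0):
        print(d, rollout(d))
```

In this toy setting a budget of zero keeps the agent on the safe action for the whole episode, while a larger budget lets it take the costly, higher-reward action until the budget is spent, mirroring the conservative versus aggressive behavior the abstract describes.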
