Joint Action Language Modelling for Transparent Policy Execution
This work addresses transparency in embodied AI agents for researchers and practitioners, though the contribution is incremental, as it builds on existing autoregressive methods.
The paper tackles opaque agent behavior by integrating natural language descriptions of actions into policy learning, recasting policy learning as a language generation task combined with autoregressive modelling. The results show that generating actions and transparent statements simultaneously often improves both the quality of the action trajectory and the language output in the Language-Table environment.
An agent's intention often remains hidden behind the black-box nature of embodied policies. Natural language statements that describe the next action can make the agent's behavior transparent. We aim to build transparency directly into the learning process by transforming the problem of policy learning into a language generation problem and combining it with traditional autoregressive modelling. The resulting model produces a transparent natural language statement followed by tokens representing the specific actions needed to solve long-horizon tasks in the Language-Table environment. Following previous work, the model learns a policy represented by special discretized tokens in an autoregressive manner. We place special emphasis on investigating the relationship between predicting actions and producing high-quality language for a transparent agent. We find that in many cases the quality of both the action trajectory and the transparent statement increases when they are generated simultaneously.
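To make the idea of a statement followed by discretized action tokens concrete, the following is a minimal sketch rather than the authors' implementation. It assumes uniform binning of each action dimension; the bin count, the action bounds, the `<act_k>` token format, and all function names are hypothetical choices made for illustration, since the abstract does not specify them.

```python
from typing import List, Sequence

# All three constants are illustrative assumptions, not values from the paper.
NUM_BINS = 256      # assumed number of discrete bins per action dimension
ACTION_LOW = -1.0   # assumed lower bound of a normalized action value
ACTION_HIGH = 1.0   # assumed upper bound of a normalized action value


def action_to_tokens(action: Sequence[float]) -> List[str]:
    """Map each continuous action dimension to one discrete token (uniform bins)."""
    tokens = []
    for value in action:
        clipped = min(max(value, ACTION_LOW), ACTION_HIGH)
        frac = (clipped - ACTION_LOW) / (ACTION_HIGH - ACTION_LOW)
        # The upper bound would land in bin NUM_BINS, so clamp it to the last bin.
        bin_index = min(int(frac * NUM_BINS), NUM_BINS - 1)
        tokens.append(f"<act_{bin_index}>")
    return tokens


def tokens_to_action(tokens: Sequence[str]) -> List[float]:
    """Decode tokens back to continuous values using each bin's centre."""
    width = (ACTION_HIGH - ACTION_LOW) / NUM_BINS
    values = []
    for token in tokens:
        bin_index = int(token[len("<act_"):-1])
        values.append(ACTION_LOW + (bin_index + 0.5) * width)
    return values


def build_target(statement: str, action: Sequence[float]) -> str:
    """Build one training target: the transparent statement, then the action tokens."""
    return statement.strip() + " " + " ".join(action_to_tokens(action))


if __name__ == "__main__":
    # Hypothetical statement and two-dimensional action.
    print(build_target("I will push the red block left.", [0.0, -1.0]))
```

A language model trained on such targets would emit the statement first and the action tokens afterwards, so the action prediction is conditioned on the generated text. Decoding with bin centres bounds the round-trip error per dimension by half a bin width.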