LG · AI · RO · Apr 25, 2023

A Closer Look at Reward Decomposition for High-Level Robotic Explanations

arXiv:2304.12958v2 · 12 citations · h-index: 46
Originality: Incremental advance
AI Analysis

This work aims to make robotic systems more transparent and explainable to human users. It is an incremental advance that integrates two existing techniques: reward decomposition and abstracted actions.

The paper tackles the challenge of explaining reinforcement learning agent behavior in robotics. It proposes an explainable Q-Map framework that combines reward decomposition with abstracted actions, yielding non-ambiguous, high-level explanations based on object properties. The framework is demonstrated through quantitative and qualitative analysis in two robotic scenarios.

Explaining the behaviour of intelligent agents learned by reinforcement learning (RL) to humans is challenging yet crucial due to their incomprehensible proprioceptive states, variational intermediate goals, and resultant unpredictability. Moreover, one-step explanations for RL agents can be ambiguous as they fail to account for the agent's future behaviour at each transition, adding to the complexity of explaining robot actions. By leveraging abstracted actions that map to task-specific primitives, we avoid explanations on the movement level. To further improve the transparency and explainability of robotic systems, we propose an explainable Q-Map learning framework that combines reward decomposition (RD) with abstracted action spaces, allowing for non-ambiguous and high-level explanations based on object properties in the task. We demonstrate the effectiveness of our framework through quantitative and qualitative analysis of two robotic scenarios, showcasing visual and textual explanations, from output artefacts of RD explanations, that are easy for humans to comprehend. Additionally, we demonstrate the versatility of integrating these artefacts with large language models (LLMs) for reasoning and interactive querying.
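The core idea of reward decomposition can be illustrated with a small sketch. It assumes the usual formulation in which the total Q-value is the sum of per-component Q-values, and it compares two abstracted actions by the per-component difference in Q-value. This is not the authors' implementation: the component names (`colour`, `shape`), the 2x3 grid standing in for a Q-Map, and every number below are hypothetical, and in the paper the maps would come from a learned model rather than a hand-written table.

```python
from typing import Dict, List, Tuple

QMap = List[List[float]]   # one Q-value per grid cell
Action = Tuple[int, int]   # (row, col) cell, standing in for an abstracted action

# Hypothetical per-component Q-Maps; the values are invented for illustration.
COMPONENTS: Dict[str, QMap] = {
    "colour": [[0.5, 0.0, 0.25], [0.0, 1.0, 0.0]],
    "shape": [[0.25, 0.5, 0.0], [0.0, 0.5, 0.75]],
}


def total_q_map(components: Dict[str, QMap]) -> QMap:
    """Sum the component maps cell by cell: Q(s, a) = sum_c Q_c(s, a)."""
    maps = list(components.values())
    rows, cols = len(maps[0]), len(maps[0][0])
    return [[sum(m[r][c] for m in maps) for c in range(cols)] for r in range(rows)]


def greedy_action(q: QMap) -> Action:
    """Return the cell with the highest total Q-value (first one on ties)."""
    cells = ((r, c) for r in range(len(q)) for c in range(len(q[0])))
    return max(cells, key=lambda rc: q[rc[0]][rc[1]])


def reward_difference(
    components: Dict[str, QMap], chosen: Action, alternative: Action
) -> Dict[str, float]:
    """Per-component Q-value advantage of `chosen` over `alternative`."""
    return {
        name: m[chosen[0]][chosen[1]] - m[alternative[0]][alternative[1]]
        for name, m in components.items()
    }


def explain(components: Dict[str, QMap], chosen: Action, alternative: Action) -> str:
    """Turn the per-component differences into a short textual explanation."""
    parts = []
    for name, delta in reward_difference(components, chosen, alternative).items():
        verb = "favours" if delta > 0 else "opposes" if delta < 0 else "is neutral on"
        parts.append(f"{name} {verb} it ({delta:+.2f})")
    return f"Chose cell {chosen} over {alternative}: " + "; ".join(parts)


if __name__ == "__main__":
    best = greedy_action(total_q_map(COMPONENTS))
    print(explain(COMPONENTS, best, (1, 2)))
```

In this toy setting the explanation is phrased in terms of object properties (which component favours or opposes a choice) rather than joint-level movements, which is the kind of high-level explanation the abstract describes.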

