AI, LG, MA, RO | Sep 27, 2024

Intention-aware policy graphs: answering what, how, and why in opaque agents

arXiv:2409.19038v1 | 4 citations | h-index: 14
Originality: Incremental advance
AI Analysis

This work addresses the need for trustworthy AI by improving explainability for complex agents, though it appears incremental because it builds on existing graphical model techniques.

The paper tackles the problem of explaining emergent behavior in opaque AI agents. It proposes a Probabilistic Graphical Model and a pipeline for computing intentions, which together enable answers to what, how, and why questions about agent actions. Its contributions include measurements of the interpretability and reliability of those explanations.

Agents are a special kind of AI-based software in that they interact in complex environments and have increased potential for emergent behaviour. Explaining such emergent behaviour is key to deploying trustworthy AI, but the increasing complexity and opaque nature of many agent implementations make this hard. In this work, we propose a Probabilistic Graphical Model, along with a pipeline for designing such a model, by which the behaviour of an agent can be deliberated about, and for computing a robust numerical value for the intentions the agent has at any moment. We contribute measurements that evaluate the interpretability and reliability of the explanations provided, and enable explainability questions such as 'what do you want to do now?' (e.g. deliver soup), 'how do you plan to do it?' (e.g. returning a plan that considers its skills and the world), and 'why would you take this action at this state?' (e.g. explaining how that furthers or hinders its own goals). This model can be constructed by taking partial observations of the agent's actions and world states, and we provide an iterative workflow for increasing the proposed measurements through better design and/or pointing out irrational agent behaviour.
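The abstract does not spell out the model's definitions, so the following is only a minimal sketch of the general idea under stated assumptions, not the authors' pipeline. It assumes that observations arrive as (state, action, next_state) triples, that the graph stores empirical transition frequencies, and that an "intention" can be approximated as the probability of eventually reaching a state that satisfies a desire. The state and action names, and the functions build_policy_graph, intention, and action_effect, are hypothetical.

```python
from collections import defaultdict


def build_policy_graph(traces):
    """Estimate P(action, next_state | state) from observed triples (assumed input format)."""
    counts = defaultdict(lambda: defaultdict(int))
    for trace in traces:
        for state, action, next_state in trace:
            counts[state][(action, next_state)] += 1
    graph = {}
    for state, outgoing in counts.items():
        total = sum(outgoing.values())
        graph[state] = {edge: n / total for edge, n in outgoing.items()}
    return graph


def intention(graph, desire_states, sweeps=100):
    """Approximate, per state, the probability of eventually reaching a desire state."""
    states = set(graph) | {ns for out in graph.values() for (_, ns) in out}
    value = {s: (1.0 if s in desire_states else 0.0) for s in states}
    for _ in range(sweeps):
        for s in states:
            # Desire states stay at 1; states with no observed exits stay at 0.
            if s in desire_states or s not in graph:
                continue
            value[s] = sum(p * value[ns] for (_, ns), p in graph[s].items())
    return value


def action_effect(graph, value, state, action):
    """Change in intention expected from taking `action` in `state` (a 'why' signal)."""
    edges = {ns: p for (a, ns), p in graph[state].items() if a == action}
    total = sum(edges.values())
    if total == 0:
        raise ValueError(f"action {action!r} was never observed in state {state!r}")
    expected = sum(p * value[ns] for ns, p in edges.items()) / total
    return expected - value[state]


# Hypothetical traces loosely inspired by the abstract's soup-delivery example.
traces = [
    [("idle", "pick_onion", "has_onion"),
     ("has_onion", "cook", "soup_ready"),
     ("soup_ready", "deliver", "delivered")],
    [("idle", "wait", "idle"),
     ("idle", "pick_onion", "has_onion"),
     ("has_onion", "drop", "onion_lost")],
]

graph = build_policy_graph(traces)
value = intention(graph, desire_states={"delivered"})

# 'What do you want to do now?': value["has_onion"] is about 0.5 for "deliver soup".
# 'Why this action?': cooking raises the intention, dropping the onion lowers it.
cook_gain = action_effect(graph, value, "has_onion", "cook")   # about +0.5
drop_gain = action_effect(graph, value, "has_onion", "drop")   # about -0.5
```

In this toy setting, a positive action_effect reads as "the action furthers the goal" and a negative one as "the action hinders it". How the paper itself defines desires, intention thresholds, and its interpretability and reliability measurements cannot be determined from the abstract alone.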

