AIFeb 24, 2023

Towards Computationally Efficient Responsibility Attribution in Decentralized Partially Observable MDPs

arXiv:2302.12676v16.75 citationsh-index: 16Has Code

Originality Incremental advance

AI Analysis

This work addresses the challenge of accountable decision-making in decentralized partially observable environments, offering an incremental improvement for multi-agent systems where exact responsibility attribution is infeasible.

The paper tackles the computationally intractable problem of responsibility attribution in multi-agent systems by introducing a Monte Carlo Tree Search-based method to approximate agents' degrees of responsibility efficiently under a computational budget, achieving practical algorithmic solutions as demonstrated in simulation-based evaluations with team-based card games.

Responsibility attribution is a key concept of accountable multi-agent decision making. Given a sequence of actions, responsibility attribution mechanisms quantify the impact of each participating agent to the final outcome. One such popular mechanism is based on actual causality, and it assigns (causal) responsibility based on the actions that were found to be pivotal for the considered outcome. However, the inherent problem of pinpointing actual causes and consequently determining the exact responsibility assignment has shown to be computationally intractable. In this paper, we aim to provide a practical algorithmic solution to the problem of responsibility attribution under a computational budget. We first formalize the problem in the framework of Decentralized Partially Observable Markov Decision Processes (Dec-POMDPs) augmented by a specific class of Structural Causal Models (SCMs). Under this framework, we introduce a Monte Carlo Tree Search (MCTS) type of method which efficiently approximates the agents' degrees of responsibility. This method utilizes the structure of a novel search tree and a pruning technique, both tailored to the problem of responsibility attribution. Other novel components of our method are (a) a child selection policy based on linear scalarization and (b) a backpropagation procedure that accounts for a minimality condition that is typically used to define actual causality. We experimentally evaluate the efficacy of our algorithm through a simulation-based test-bed, which includes three team-based card games.

View on arXiv PDF Code

Similar