AI | Mar 12, 2025

PairVDN - Pair-wise Decomposed Value Functions

arXiv:2503.09521v1 | 1 citation | h-index: 1 | Has Code

Originality: Incremental advance

AI Analysis

This work addresses the credit assignment problem in multi-agent reinforcement learning. It offers an approach for scenarios where per-agent value decomposition is insufficient, but the advance is incremental because it builds on existing value decomposition methods.

The paper tackles the challenge of applying deep Q-learning to cooperative multi-agent settings. It proposes PairVDN, a method that decomposes the value function into pair-wise functions to improve expressivity, and it reports improved performance over the VDN and QMIX baselines in Box Jump, a new many-agent environment.

Extending deep Q-learning to cooperative multi-agent settings is challenging due to the exponential growth of the joint action space, the non-stationary environment, and the credit assignment problem. Value decomposition allows deep Q-learning to be applied at the joint agent level, at the cost of reduced expressivity. Building on past work in this direction, our paper proposes PairVDN, a novel method for decomposing the value function into a collection of pair-wise, rather than per-agent, functions, improving expressivity at the cost of requiring a more complex (but still efficient) dynamic programming maximisation algorithm. Our method enables the representation of value functions which cannot be expressed as a monotonic combination of per-agent functions, unlike past approaches such as VDN and QMIX. We implement a novel many-agent cooperative environment, Box Jump, and demonstrate improved performance over these baselines in this setting. We open-source our code and environment at https://github.com/zzbuzzard/PairVDN.
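To make the idea of a pair-wise decomposition with dynamic programming maximisation concrete, here is a minimal sketch. It assumes the pairs form a simple chain, so that the joint value is Q(a) = sum over i of Q_i(a_i, a_{i+1}); the abstract does not state which pairing structure PairVDN uses, so the chain topology, the function names (`chain_argmax`, `joint_value`, `brute_force_max`), and the tabular NumPy representation are illustrative assumptions rather than the authors' implementation.

```python
import itertools
import numpy as np


def chain_argmax(pair_q):
    """Maximise sum_i pair_q[i][a_i, a_{i+1}] over joint actions.

    ASSUMPTION: agents are linked in a chain (agent i paired with i+1).
    pair_q is a list of (A, A) arrays, one per adjacent pair, so there
    are len(pair_q) + 1 agents. Runs in O(n * A^2) instead of O(A^n).
    """
    n_actions = pair_q[0].shape[0]
    # best[a] = best total of all pairs processed so far, given that the
    # most recent agent takes action a.
    best = np.zeros(n_actions)
    back = []  # back-pointers used to recover the maximising actions
    for q in pair_q:
        totals = best[:, None] + q          # totals[a_i, a_{i+1}]
        back.append(totals.argmax(axis=0))  # best a_i for each a_{i+1}
        best = totals.max(axis=0)
    # Pick the last agent's action, then walk the back-pointers in reverse.
    actions = [int(best.argmax())]
    for ptr in reversed(back):
        actions.append(int(ptr[actions[-1]]))
    actions.reverse()
    return actions, float(best.max())


def joint_value(pair_q, actions):
    """Evaluate the chain-decomposed joint value for one joint action."""
    return float(sum(q[actions[i], actions[i + 1]] for i, q in enumerate(pair_q)))


def brute_force_max(pair_q):
    """Exhaustive maximisation, only feasible for tiny problems."""
    n_agents = len(pair_q) + 1
    n_actions = pair_q[0].shape[0]
    joint_actions = itertools.product(range(n_actions), repeat=n_agents)
    return max(joint_value(pair_q, a) for a in joint_actions)


# Two agents rewarded only for choosing different actions. No sum of
# per-agent terms f(a_0) + g(a_1) reproduces this table, but a single
# pair-wise table represents it exactly.
xor_q = [np.array([[0.0, 1.0], [1.0, 0.0]])]
xor_actions, xor_value = chain_argmax(xor_q)

# A random 5-agent, 3-action chain, checked against exhaustive search.
rng = np.random.default_rng(0)
random_q = [rng.normal(size=(3, 3)) for _ in range(4)]
dp_actions, dp_value = chain_argmax(random_q)
```

The two-agent table illustrates the expressivity point from the abstract: per-agent sums (as in VDN) cannot express a payoff that depends only on whether two actions differ, whereas one pair-wise term can. The chain recursion is the standard Viterbi-style maximisation; a different pairing structure would need a correspondingly adapted recursion.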
