LG | AI | MA | Feb 8, 2024

Risk-Sensitive Multi-Agent Reinforcement Learning in Network Aggregative Markov Games

arXiv:2402.05906v1 | 2 citations | h-index: 2 | AAMAS
AI Analysis

The paper addresses risk-sensitive decision-making for MARL agents with human-like preferences, but its contribution is incremental: it extends existing methods to one specific game type.

The paper incorporates risk-sensitive preferences into multi-agent reinforcement learning (MARL) by applying cumulative prospect theory (CPT) to network aggregative Markov games. It proposes a distributed actor-critic algorithm that, under a set of assumptions, converges to a subjective Markov perfect Nash equilibrium, and its experiments show that agents with higher loss aversion tend to socially isolate.

Classical multi-agent reinforcement learning (MARL) assumes risk neutrality and complete objectivity for agents. However, in settings where agents need to consider or model human economic or social preferences, a notion of risk must be incorporated into the RL optimization problem. This will be of greater importance in MARL where other human or non-human agents are involved, possibly with their own risk-sensitive policies. In this work, we consider risk-sensitive and non-cooperative MARL with cumulative prospect theory (CPT), a non-convex risk measure and a generalization of coherent measures of risk. CPT is capable of explaining loss aversion in humans and their tendency to overestimate/underestimate small/large probabilities. We propose a distributed sampling-based actor-critic (AC) algorithm with CPT risk for network aggregative Markov games (NAMGs), which we call Distributed Nested CPT-AC. Under a set of assumptions, we prove the convergence of the algorithm to a subjective notion of Markov perfect Nash equilibrium in NAMGs. The experimental results show that subjective CPT policies obtained by our algorithm can be different from the risk-neutral ones, and agents with a higher loss aversion are more inclined to socially isolate themselves in an NAMG.
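To make the two CPT ingredients named in the abstract concrete (loss aversion and distorted probabilities), here is a minimal sketch of a sample-based CPT value estimate. It is not the paper's Distributed Nested CPT-AC algorithm. It assumes the standard Tversky-Kahneman power utility and probability weighting forms together with an order-statistics estimator, and the default parameter values are the commonly quoted Tversky-Kahneman estimates rather than values taken from this paper. The function names `tk_weight` and `cpt_estimate` are hypothetical.

```python
def tk_weight(p, gamma):
    """Tversky-Kahneman probability weighting (assumed form).

    For gamma < 1 it inflates small probabilities and deflates large ones;
    gamma = 1 recovers the undistorted w(p) = p.
    """
    return p**gamma / (p**gamma + (1.0 - p)**gamma) ** (1.0 / gamma)


def cpt_estimate(samples, alpha=0.88, beta=0.88, lam=2.25,
                 gamma_gain=0.61, gamma_loss=0.69, reference=0.0):
    """Estimate the CPT value of a random payoff from i.i.d. samples.

    Payoffs above `reference` are gains, payoffs below are losses.
    `lam` is the loss-aversion coefficient: losses are scaled by lam.
    All default parameters are illustrative assumptions.
    """
    x = sorted(s - reference for s in samples)  # ascending order statistics
    n = len(x)
    gains = 0.0
    losses = 0.0
    for i, xi in enumerate(x, start=1):
        if xi > 0:
            # Weight from the distorted upper-tail probability of the i-th sample.
            dw = tk_weight((n + 1 - i) / n, gamma_gain) - tk_weight((n - i) / n, gamma_gain)
            gains += xi**alpha * dw
        elif xi < 0:
            # Weight from the distorted lower-tail probability of the i-th sample.
            dw = tk_weight(i / n, gamma_loss) - tk_weight((i - 1) / n, gamma_loss)
            losses += lam * (-xi)**beta * dw
    return gains - losses


if __name__ == "__main__":
    payoffs = [-2.0, -1.0, 0.5, 3.0]
    # Risk-neutral special case: the estimate reduces to the sample mean.
    print(cpt_estimate(payoffs, alpha=1, beta=1, lam=1, gamma_gain=1, gamma_loss=1))
    # A more loss-averse agent assigns the same payoffs a lower subjective value.
    print(cpt_estimate(payoffs, lam=1.0), cpt_estimate(payoffs, lam=3.0))
```

Setting every parameter to 1 recovers the ordinary expected payoff, which is one way to see how a subjective CPT policy can diverge from a risk-neutral one while being evaluated on the same samples.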

Code Implementations: 1 repo
