LG GT MA OC

Oct 31, 2025

Aspiration-based Perturbed Learning Automata in Games with Noisy Utility Measurements. Part A: Stochastic Stability in Non-zero-Sum Games

arXiv:2511.11602v14.11 citations

h-index: 14

ECC

Originality: Incremental advance

AI Analysis

This work addresses a key limitation in reinforcement learning for distributed systems, enabling more robust optimization in noisy environments, though it is incremental in extending stability analysis to broader game classes.

The paper tackles distributed optimization in multi-player games with noisy utility measurements by introducing aspiration-based perturbed learning automata (APLA), a novel payoff-based learning scheme aimed at convergence to pure Nash equilibria in weakly acyclic games, whereas prior work was limited to potential and coordination games.

Reinforcement-based learning has attracted considerable attention both in modeling human behavior and in engineering, where it is used to design measurement- or payoff-based optimization schemes. Such learning schemes offer several advantages, especially in filtering out noisy observations. However, they also have limitations when applied in a distributed setup. In multi-player weakly acyclic games, when each player applies an independent copy of the learning dynamics, convergence to (usually desirable) pure Nash equilibria cannot be guaranteed. Prior work has focused only on a small class of games, namely potential and coordination games. To address this limitation, this paper introduces a novel payoff-based learning scheme for distributed optimization, namely aspiration-based perturbed learning automata (APLA). In this class of dynamics, and contrary to standard reinforcement-based learning schemes, each player's probability distribution for selecting actions is reinforced both by repeated selection and by an aspiration factor that captures the player's satisfaction level. We provide a stochastic stability analysis of APLA in multi-player positive-utility games in the presence of noisy observations. This first part of the paper characterizes stochastic stability in generic non-zero-sum games by establishing the equivalence of the induced infinite-dimensional Markov chain with a finite-dimensional one. In the second part, stochastic stability is further specialized to weakly acyclic games.
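The abstract describes the APLA mechanism only in words, so the following is a rough illustrative sketch rather than the paper's actual update rule. It assumes utilities in (0, 1], a small uniform exploration probability `lam`, a step size `eps`, an aspiration smoothing rate `nu`, and a hypothetical satisfaction factor that damps reinforcement when the measured utility falls below the aspiration level. All function names, parameters, and the toy two-player game are assumptions made for illustration.

```python
import random

def select_action(x, lam, rng):
    """Sample an action from strategy x, or uniformly with small probability lam."""
    if rng.random() < lam:
        return rng.randrange(len(x))
    return rng.choices(range(len(x)), weights=x, k=1)[0]

def apla_like_update(x, rho, action, utility, eps=0.05, nu=0.1, low=0.2):
    """One illustrative update of a strategy x and an aspiration level rho.

    Assumes 0 < utility <= 1 and 0 < eps <= 1, so the step lies in (0, 1]
    and the strategy stays on the probability simplex.
    """
    # Hypothetical aspiration factor: full reinforcement when the measured
    # utility meets the aspiration level, damped reinforcement otherwise.
    satisfaction = 1.0 if utility >= rho else low
    step = eps * utility * satisfaction
    # Move the strategy toward the action that was just selected.
    new_x = [(1.0 - step) * p for p in x]
    new_x[action] += step
    # Aspiration level tracks a running average of the noisy utilities.
    new_rho = rho + nu * (utility - rho)
    return new_x, new_rho

def noisy_utility(a, b, rng):
    """Toy positive-utility coordination payoff with bounded measurement noise."""
    base = 0.9 if a == b else 0.3
    return min(1.0, max(0.01, base + rng.uniform(-0.05, 0.05)))

def run(steps=2000, seed=0):
    """Two players, each running an independent copy of the dynamics."""
    rng = random.Random(seed)
    xs = [[0.5, 0.5], [0.5, 0.5]]
    rhos = [0.5, 0.5]
    for _ in range(steps):
        acts = [select_action(x, 0.01, rng) for x in xs]
        u = noisy_utility(acts[0], acts[1], rng)
        for i in range(2):
            xs[i], rhos[i] = apla_like_update(xs[i], rhos[i], acts[i], u)
    return xs, rhos

if __name__ == "__main__":
    strategies, aspirations = run()
    print(strategies, aspirations)
```

Each player sees only its own action and a noisy scalar utility, which is the payoff-based setting the abstract refers to. The sketch makes no claim about convergence; the stochastic stability results belong to the paper's analysis, not to this toy rule.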
