LG MLJul 15, 2020

Newton Optimization on Helmholtz Decomposition for Continuous Games

arXiv:2007.07804v36.55 citations

Originality Highly original

AI Analysis

This addresses the problem of convergence in multi-agent learning for researchers, offering a novel method that improves upon existing algorithms, though it is incremental in its application to known bottlenecks.

The paper tackles the challenge of multi-agent learning where standard policy gradient algorithms fail due to non-stationarity and conflicting interests, proposing NOHD, a Newton-like algorithm based on Helmholtz decomposition, which ensures quadratic convergence in specific systems and is attracted to stable fixed points, with empirical comparisons showing performance improvements on bimatrix games and a continuous Gridworld environment.

Many learning problems involve multiple agents optimizing different interactive functions. In these problems, the standard policy gradient algorithms fail due to the non-stationarity of the setting and the different interests of each agent. In fact, algorithms must take into account the complex dynamics of these systems to guarantee rapid convergence towards a (local) Nash equilibrium. In this paper, we propose NOHD (Newton Optimization on Helmholtz Decomposition), a Newton-like algorithm for multi-agent learning problems based on the decomposition of the dynamics of the system in its irrotational (Potential) and solenoidal (Hamiltonian) component. This method ensures quadratic convergence in purely irrotational systems and pure solenoidal systems. Furthermore, we show that NOHD is attracted to stable fixed points in general multi-agent systems and repelled by strict saddle ones. Finally, we empirically compare the NOHD's performance with that of state-of-the-art algorithms on some bimatrix games and in a continuous Gridworld environment.

View on arXiv PDF

Similar