NE MA PE Dec 17, 2018

Malthusian Reinforcement Learning

Joel Z. Leibo, Julien Perolat, Edward Hughes, Steven Wheelwright, Adam H. Marblestone, Edgar Duéñez-Guzmán, Peter Sunehag, Iain Dunning, Thore Graepel

arXiv:1812.07019v2 17.8 41 citations

Originality: Incremental advance

AI Analysis

This paper addresses exploration and synergy in multi-agent systems, a challenge for AI researchers, though it appears incremental because it builds on self-play.

The paper tackles multi-agent reinforcement learning by introducing Malthusian reinforcement learning, a framework that uses fitness-linked population size dynamics to drive ongoing innovation. It shows that this framework exploits specialization and division of labor better than self-play algorithms.

Here we explore a new algorithmic framework for multi-agent reinforcement learning, called Malthusian reinforcement learning, which extends self-play to include fitness-linked population size dynamics that drive ongoing innovation. In Malthusian RL, increases in a subpopulation's average return drive subsequent increases in its size, just as Thomas Malthus argued in 1798 was the relationship between preindustrial income levels and population growth. Malthusian reinforcement learning harnesses the competitive pressures arising from growing and shrinking population size to drive agents to explore regions of state and policy spaces that they could not otherwise reach. Furthermore, in environments where there are potential gains from specialization and division of labor, we show that Malthusian reinforcement learning is better positioned to take advantage of such synergies than algorithms based on self-play.
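The fitness-linked population dynamic described in the abstract can be illustrated with a small sketch. This is not the paper's update rule: it assumes a fixed total population and a replicator-style (exponential weights) reallocation, and the names `malthusian_update`, `sizes`, `avg_returns`, and `step_size` are hypothetical, chosen only for illustration.

```python
import math


def malthusian_update(sizes, avg_returns, step_size=1.0):
    """Reallocate a fixed total population across subpopulations.

    Assumption (not from the paper): each subpopulation's new share is
    proportional to its current size times exp(step_size * average return),
    so a higher average return leads to a larger size at the next step.
    Sizes are returned as floats; rounding to whole agents is left out.
    """
    total_population = sum(sizes)
    # Subtract the maximum return before exponentiating for numerical stability.
    max_return = max(avg_returns)
    weights = [
        n * math.exp(step_size * (r - max_return))
        for n, r in zip(sizes, avg_returns)
    ]
    normaliser = sum(weights)
    return [total_population * w / normaliser for w in weights]


# Two subpopulations of equal size; the first earned a higher average return.
new_sizes = malthusian_update([10.0, 10.0], [1.0, 0.0])
```

Under these assumptions the first subpopulation grows and the second shrinks while the total stays at 20. As a subpopulation grows, more agents compete for the same resources, which is the kind of competitive pressure the abstract credits with driving exploration.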

