LG | AI | Mar 8, 2021

Adversarial Reinforcement Learning for Procedural Content Generation

arXiv:2103.04847v2 | 68 citations
AI Analysis

This work addresses the problem of improving generalization for RL agents in game development by generating diverse and controllable environments, though it is incremental as it builds on existing adversarial and procedural generation methods.

The paper tackles the challenge of training reinforcement learning agents in novel environments by introducing ARLPCG, an adversarial reinforcement learning approach for procedural content generation. The authors report a significantly better solve ratio in two 3D game environments: a third-person platformer and a racing game.

We present a new approach, ARLPCG: Adversarial Reinforcement Learning for Procedural Content Generation, which procedurally generates and tests previously unseen environments with an auxiliary input as a control variable. Training RL agents over novel environments is a notoriously difficult task. One popular approach is to procedurally generate different environments to increase the generalizability of the trained agents. ARLPCG instead deploys an adversarial model with one PCG RL agent (called Generator) and one solving RL agent (called Solver). The Generator receives a reward signal based on the Solver's performance, which encourages the environment design to be challenging but not impossible. To further drive diversity and control of the environment generation, we propose using auxiliary inputs for the Generator. The benefit is two-fold: Firstly, the Solver achieves better generalization through the Generator's generated challenges. Secondly, the trained Generator can be used as a creator of novel environments that, together with the Solver, can be shown to be solvable. We create two types of 3D environments to validate our model, representing two popular game genres: a third-person platformer and a racing game. In these cases, we show that ARLPCG has a significantly better solve ratio, and that the auxiliary inputs render the level creation controllable to a certain degree. For a video compilation of the results please visit https://youtu.be/z7q2PtVsT0I.
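To make the Generator's incentive concrete, the following is a minimal sketch of how a reward tied to the Solver's performance and scaled by an auxiliary input could look. It is an illustration under stated assumptions, not the paper's actual reward function: the name `generator_reward`, the constant `FAIL_BONUS`, the range [-1, 1] for `aux`, and the additive form are all hypothetical.

```python
# Hypothetical sketch: none of these names or formulas come from the paper.
FAIL_BONUS = 1.0  # assumed size of the difficulty term


def generator_reward(solver_progress: float, solver_failed: bool, aux: float) -> float:
    """Toy Generator reward for one generated level segment.

    solver_progress: assumed progress signal in [0, 1] from the Solver.
    solver_failed:   True if the Solver failed on this segment.
    aux:             assumed auxiliary control input in [-1, 1].
    """
    if not -1.0 <= aux <= 1.0:
        raise ValueError("aux is assumed to lie in [-1, 1]")

    # Rewarding Solver progress pushes the Generator toward solvable levels.
    reward = solver_progress

    # The auxiliary input scales a difficulty term: a positive value rewards
    # the Generator when the Solver fails, a negative value penalizes it.
    if solver_failed:
        reward += aux * FAIL_BONUS

    return reward
```

In this toy form, sweeping `aux` from -1 to 1 at generation time would shift the Generator from easier to harder segments, which is one way to read the abstract's claim that the auxiliary input makes level creation controllable to a certain degree.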

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
