LG AI · Aug 22, 2024

PCGRL+: Scaling, Control and Generalization in Reinforcement Learning Level Generators

arXiv:2408.12525v1 · 12.57 citations · h-index: 10

Originality: Incremental advance

AI Analysis

This work addresses scalability and control issues in PCGRL for game designers, though it is incremental as it builds on prior PCGRL frameworks with optimizations and extensions.

The paper tackles the computational limits of Procedural Content Generation via Reinforcement Learning (PCGRL) for game level generation by implementing the environments in JAX, so that training is parallelized on the GPU. This significantly improves training speed and allows models to be trained for 1 billion timesteps. It also introduces randomized level sizes and frozen tiles for better designer control, and finds that agents with partial observations generalize better to large, out-of-distribution maps.

Procedural Content Generation via Reinforcement Learning (PCGRL) has been introduced as a means by which controllable designer agents can be trained based only on a set of computable metrics acting as a proxy for the level's quality and key characteristics. While PCGRL offers a unique set of affordances for game designers, it is constrained by the compute-intensive process of training RL agents, and has so far been limited to generating relatively small levels. To address this issue of scale, we implement several PCGRL environments in JAX so that all aspects of learning and simulation happen in parallel on the GPU. This results in faster environment simulation, removes the CPU-GPU information-transfer bottleneck during RL training, and ultimately yields significantly improved training speed. We replicate several key results from prior works in this new framework, letting models train for much longer than previously studied, and evaluating their behavior after 1 billion timesteps. Aiming for greater control for human designers, we introduce randomized level sizes and frozen "pinpoints" of pivotal game tiles as further ways of countering overfitting. To test the generalization ability of learned generators, we evaluate models on large, out-of-distribution map sizes, and find that models trained with partial observation sizes learn more robust design strategies.
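The three control and generalization ideas named in the abstract (randomized level sizes, frozen tiles, and partial observations) can be illustrated with a small sketch. This is not the paper's code, and it uses plain Python rather than JAX for readability. The function names, the tile encoding (`EMPTY`, `WALL`, `BORDER`), and the padding scheme are all assumptions made for illustration.

```python
import random

# Hypothetical tile encoding, assumed for this sketch only.
EMPTY, WALL, BORDER = 0, 1, -1  # BORDER pads views that extend past the map edge


def sample_level(rng, min_size, max_size):
    """Create an empty level whose height and width are drawn at random."""
    height = rng.randint(min_size, max_size)
    width = rng.randint(min_size, max_size)
    return [[EMPTY] * width for _ in range(height)]


def local_observation(level, pos, obs_size):
    """Return an obs_size x obs_size window of `level` centered on `pos`.

    Cells outside the map are filled with BORDER, so the observation shape
    does not depend on the size of the level.
    """
    half = obs_size // 2
    row, col = pos
    height, width = len(level), len(level[0])
    window = []
    for r in range(row - half, row - half + obs_size):
        line = []
        for c in range(col - half, col - half + obs_size):
            inside = 0 <= r < height and 0 <= c < width
            line.append(level[r][c] if inside else BORDER)
        window.append(line)
    return window


def apply_edit(level, frozen, pos, tile):
    """Return a copy of `level` with `tile` written at `pos`.

    If the target cell is marked True in `frozen`, the edit is ignored and
    the level is returned unchanged.
    """
    row, col = pos
    if frozen[row][col]:
        return level
    new_level = [list(r) for r in level]
    new_level[row][col] = tile
    return new_level


if __name__ == "__main__":
    rng = random.Random(0)
    level = sample_level(rng, 4, 16)
    frozen = [[False] * len(level[0]) for _ in level]
    frozen[0][0] = True  # a designer-pinned tile the agent may not change
    level = apply_edit(level, frozen, (0, 0), WALL)  # ignored: cell is frozen
    level = apply_edit(level, frozen, (1, 1), WALL)  # applied
    for line in local_observation(level, (0, 0), 5):
        print(line)
```

Because the window has a fixed shape regardless of map size, a policy that consumes it can in principle be run on maps larger than those seen during training, which is the setting in which the abstract reports more robust strategies.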
