LG AI CV RO

Sep 22, 2023

Diagnosing and exploiting the computational demands of videos games for deep reinforcement learning

Lakshmi Narasimhan Govindarajan, Rex G Liu, Drew Linsley, Alekh Karkada Ashok, Max Reuter, Michael J Frank, Thomas Serre

arXiv:2309.13181v1

2.0

h-index: 75

Originality: Incremental advance

AI Analysis

This work addresses a foundational question in AI by diagnosing the computational demands that video games place on deep reinforcement learning, which could make algorithm development more efficient. It is an incremental advance because it builds on existing benchmarks and methods.

The authors ask whether the successes of deep reinforcement learning (dRL) in video games stem from visual representation learning or from the reinforcement learning algorithms themselves. To answer this, they introduce the Learning Challenge Diagnosticator (LCD), a tool that separately measures the perceptual and reinforcement learning demands of a task. Applying LCD to the Procgen benchmark, they discover a novel taxonomy of challenges and show that its predictions are reliable and useful for guiding algorithmic development.

Humans learn by interacting with their environments and perceiving the outcomes of their actions. A landmark in artificial intelligence has been the development of deep reinforcement learning (dRL) algorithms capable of doing the same in video games, on par with or better than humans. However, it remains unclear whether the successes of dRL models reflect advances in visual representation learning, the effectiveness of reinforcement learning algorithms at discovering better policies, or both. To address this question, we introduce the Learning Challenge Diagnosticator (LCD), a tool that separately measures the perceptual and reinforcement learning demands of a task. We use LCD to discover a novel taxonomy of challenges in the Procgen benchmark, and demonstrate that these predictions are both highly reliable and can instruct algorithmic development. More broadly, the LCD reveals multiple failure cases that can occur when optimizing dRL algorithms over entire video game benchmarks like Procgen, and provides a pathway towards more efficient progress.
