LG, ML
Jun 17, 2018

Task-Relevant Object Discovery and Categorization for Playing First-person Shooter Games

arXiv:1806.06392v1
2 citations
AI Analysis

This paper addresses data efficiency for reinforcement learning in complex video games, but its contribution is incremental, as it builds on existing representation learning methods.

The paper tackles the problem of learning to play first-person shooter games with high-dimensional observations by developing a method to discover and categorize task-relevant objects from raw screen images, reducing data needs and improving agent performance in Doom experiments.

We consider the problem of learning to play first-person shooter (FPS) video games using raw screen images as observations and keyboard inputs as actions. The high dimensionality of the observations in this type of application leads to a prohibitive need for training data for model-free methods such as the deep Q-network (DQN) and its recurrent variant, DRQN. Thus, recent works have focused on learning low-dimensional representations that may reduce the need for data. This paper presents a new and efficient method for learning such representations. Salient segments of consecutive frames are detected from their optical flow and clustered based on their feature descriptors. The clusters typically correspond to different discovered categories of objects. Segments detected in new frames are then classified based on their nearest clusters. Because only a few categories are relevant to a given task, the importance of a category is defined as the correlation between its occurrence and the agent's performance. The result is encoded as a vector indicating objects that are in the frame and their locations, and is used as a side input to DRQN. Experiments on the game Doom provide good evidence for the benefit of this approach.
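
The last three steps of the pipeline described in the abstract (nearest-cluster classification, correlation-based category importance, and the side-input vector) can be illustrated with a minimal sketch. This is not the authors' implementation. The abstract does not specify the descriptor, the clustering algorithm, the correlation measure, or the layout of the vector, so the sketch assumes Euclidean distance to cluster centroids, Pearson correlation over per-episode occurrence counts, and a coarse grid of frame cells for object locations. All function names and parameters (`assign_to_nearest_cluster`, `category_importance`, `encode_frame`, `grid`) are hypothetical, and the optical-flow segmentation and clustering steps are omitted.

```python
import numpy as np


def assign_to_nearest_cluster(descriptors, centroids):
    """Label each segment descriptor with the index of its nearest centroid.

    Assumption: Euclidean distance; the abstract only says "nearest clusters".
    descriptors: (num_segments, dim), centroids: (num_clusters, dim).
    """
    diffs = descriptors[:, None, :] - centroids[None, :, :]
    return np.argmin((diffs ** 2).sum(axis=2), axis=1)


def category_importance(occurrence, scores):
    """Correlate each category's occurrence with the agent's performance.

    Assumption: Pearson correlation over episodes.
    occurrence: (num_episodes, num_categories) counts, scores: (num_episodes,).
    Categories whose occurrence never varies get importance 0.
    """
    occ = occurrence - occurrence.mean(axis=0)
    sc = scores - scores.mean()
    denom = np.sqrt((occ ** 2).sum(axis=0) * (sc ** 2).sum())
    safe = np.where(denom > 0, denom, 1.0)  # avoid division by zero
    return np.where(denom > 0, (occ.T @ sc) / safe, 0.0)


def encode_frame(labels, positions, relevant, grid=(2, 2)):
    """Build a binary vector: one slot per (relevant category, grid cell).

    Assumption: locations are normalized (x, y) pairs in [0, 1] and are
    quantized to a rows-by-columns grid; the paper may encode them differently.
    """
    rows, cols = grid
    cells = rows * cols
    vec = np.zeros(len(relevant) * cells)
    for label, (x, y) in zip(labels, positions):
        label = int(label)
        if label in relevant:
            row = min(int(y * rows), rows - 1)
            col = min(int(x * cols), cols - 1)
            vec[relevant.index(label) * cells + row * cols + col] = 1.0
    return vec
```

In such a setup, the categories with the largest absolute importance would be kept as `relevant`, and the vector returned by `encode_frame` would be passed to the recurrent Q-network alongside the usual image features.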
