LG, CV, ML | Jan 27, 2020

Rotation, Translation, and Cropping for Zero-Shot Generalization

arXiv:2001.09908v3 | 39 citations
AI Analysis

This work addresses poor generalization in DRL agents for visual domains, which matters for deploying agents in varied real-world scenarios. The contribution is incremental: it builds on existing methods with specific modifications.

The paper tackles the poor generalization of deep reinforcement learning agents trained on fixed environments. It hypothesizes that the input representation is a key factor and demonstrates that applying rotation, cropping, and translation to observations improves generalization to unseen levels of 2D arcade games. The improvement holds on both human-designed and procedurally generated levels.

Deep Reinforcement Learning (DRL) has shown impressive performance on domains with visual inputs, in particular various games. However, the agent is usually trained on a fixed environment, e.g. a fixed number of levels. A growing body of evidence suggests that these trained models fail to generalize to even slight variations of the environments they were trained on. This paper advances the hypothesis that the lack of generalization is partly due to the input representation, and explores how rotation, cropping and translation could increase generality. We show that an agent given a cropped, translated and rotated observation generalizes better to unseen levels of two-dimensional arcade games from the GVGAI framework. The generality of the agents is evaluated on both human-designed and procedurally generated levels.
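
The abstract does not spell out how the three transformations are applied, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes a tile-based 2D observation stored as a NumPy array, a known agent position, and an agent heading expressed in quarter turns. The function name `egocentric_view` and the parameters `agent_rc`, `quarter_turns`, `radius`, and `pad_value` are hypothetical names introduced for illustration.

```python
import numpy as np


def egocentric_view(grid, agent_rc, quarter_turns, radius, pad_value=0):
    """Return a square window centred on the agent and rotated by its heading.

    All names here are illustrative assumptions, not taken from the paper.
    grid          : 2D array of tile ids (the full level observation)
    agent_rc      : (row, col) of the agent in `grid`
    quarter_turns : number of 90-degree counter-clockwise turns to apply
    radius        : half-width of the window; output is (2*radius+1) square
    pad_value     : tile id used for cells that fall outside the level
    """
    grid = np.asarray(grid)
    row, col = agent_rc

    # Pad every side by `radius` so the window never runs off the level.
    padded = np.pad(grid, radius, mode="constant", constant_values=pad_value)

    # Translation + cropping: in `padded` the agent sits at
    # (row + radius, col + radius), so this slice is centred on it.
    size = 2 * radius + 1
    window = padded[row:row + size, col:col + size]

    # Rotation: turn the window so the view is relative to the heading.
    return np.rot90(window, k=quarter_turns)


# Toy 5x5 level with distinct tile ids 1..25.
level = np.arange(1, 26).reshape(5, 5)
centre_view = egocentric_view(level, agent_rc=(2, 2), quarter_turns=0, radius=1)
corner_view = egocentric_view(level, agent_rc=(0, 0), quarter_turns=0, radius=1)
rotated_view = egocentric_view(level, agent_rc=(2, 2), quarter_turns=1, radius=1)
```

Under these assumptions the agent always occupies the centre cell of the output, whatever the size or layout of the level, which is the kind of level-independent input representation the abstract argues for.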

Code Implementations: 1 repo