LG AI ML · Dec 13, 2017

Multi-focus Attention Network for Efficient Deep Reinforcement Learning

Jinyoung Choi, Beom-Jin Lee, Byoung-Tak Zhang

arXiv:1712.04603v1 · 12.952 citations

Originality: Incremental advance

AI Analysis

This paper addresses the sample inefficiency of deep reinforcement learning (DRL), a concern for both researchers and practitioners, and offers an incremental improvement over existing attention-based methods.

The paper tackles the sample inefficiency of deep reinforcement learning (DRL) models, which need vast numbers of experience samples, by proposing a Multi-focus Attention Network (MANet). Mimicking human perception, MANet spatially abstracts sensory input into multiple entities and attends to them simultaneously. The authors report the highest scores with significantly fewer experience samples, and 20% faster learning than the existing state-of-the-art model on multi-agent cooperative tasks.

Deep reinforcement learning (DRL) has shown incredible performance in learning various tasks to the human level. However, unlike human perception, current DRL models connect the entire low-level sensory input to the state-action values rather than exploiting the relationships among the entities that constitute the sensory input. Because of this difference, DRL needs a vast amount of experience samples to learn. In this paper, we propose a Multi-focus Attention Network (MANet), which mimics the human ability to spatially abstract the low-level sensory input into multiple entities and attend to them simultaneously. The proposed method first divides the low-level input into several segments, which we refer to as partial states. After this segmentation, parallel attention layers attend to the partial states relevant to solving the task. Our model estimates state-action values using these attended partial states. In our experiments, MANet attains the highest scores with significantly fewer experience samples. Additionally, the model shows higher performance than the Deep Q-network and the single attention model used as benchmarks. Furthermore, we extend our model to an attentive communication model for performing multi-agent cooperative tasks. In multi-agent cooperative task experiments, our model shows 20% faster learning than the existing state-of-the-art model.
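
The pipeline the abstract describes (segment the input into partial states, attend to them with several parallel attention layers, then estimate state-action values from the attended result) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not give architectural details, so the key/value projections, the learned per-head selector vectors, the concatenation of head outputs, the linear output layer, and every name and shape below (`split_into_partial_states`, `multi_focus_q_values`, `init_params`, `w_key`, `w_value`, `selectors`, `w_out`) are assumptions made for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def split_into_partial_states(feature_map):
    """Treat each spatial cell of an (H, W, C) feature map as one partial state.

    Assumption: the 'segments' are spatial cells; the abstract only says the
    input is divided into several segments.
    """
    h, w, c = feature_map.shape
    return feature_map.reshape(h * w, c)  # (N, C) with N = H * W


def init_params(rng, channels, dim, n_heads, n_actions):
    """Random parameters for the sketch (hypothetical names and shapes)."""
    return {
        "w_key": rng.standard_normal((channels, dim)) * 0.1,
        "w_value": rng.standard_normal((channels, dim)) * 0.1,
        # One selector vector per parallel attention layer (head).
        "selectors": rng.standard_normal((n_heads, dim)) * 0.1,
        "w_out": rng.standard_normal((n_heads * dim, n_actions)) * 0.1,
    }


def multi_focus_q_values(feature_map, params):
    """Estimate Q-values from several attention heads over partial states."""
    partial = split_into_partial_states(feature_map)   # (N, C)
    keys = partial @ params["w_key"]                   # (N, D)
    values = partial @ params["w_value"]               # (N, D)
    scores = params["selectors"] @ keys.T              # (K, N)
    weights = softmax(scores, axis=1)                  # each head sums to 1
    attended = weights @ values                        # (K, D)
    # Assumption: head outputs are concatenated and mapped linearly to Q-values.
    q_values = attended.reshape(-1) @ params["w_out"]  # (A,)
    return q_values, weights


rng = np.random.default_rng(0)
params = init_params(rng, channels=8, dim=16, n_heads=4, n_actions=5)
feature_map = rng.standard_normal((6, 6, 8))  # stand-in for encoded sensory input
q_values, attention = multi_focus_q_values(feature_map, params)
greedy_action = int(np.argmax(q_values))
```

Each row of `attention` is one head's distribution over the 36 partial states, so different heads can focus on different regions of the input at the same time, which is the "multi-focus" idea in its simplest form. A single-attention baseline would correspond to `n_heads=1` in this sketch.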
