AI | CV | Jul 25, 2018

Attend Before you Act: Leveraging human visual attention for continual learning

arXiv:1807.09664v1 | 7 citations
Originality: Synthesis-oriented
AI Analysis

This work addresses continual learning challenges for AI agents in navigation tasks, but it appears incremental as it builds on existing architectures and methods.

The paper tackles continual learning in 3D navigation by leveraging human visual attention: it trains an agent on foveated images built from saliency maps and measures performance in noisy environments.

When humans perform a task, such as playing a game, they selectively pay attention to certain parts of the visual input, gathering relevant information and sequentially combining it to build a representation from the sensory data. In this work, we explore leveraging where humans look in an image as an implicit indication of what is salient for decision making. We build on top of the UNREAL architecture in DeepMind Lab's 3D navigation maze environment. We train the agent both with original images and foveated images, which were generated by overlaying the original images with saliency maps generated using a real-time spectral residual technique. We investigate the effectiveness of this approach in transfer learning by measuring performance in the context of noise in the environment.
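The foveation step described in the abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the function names, the 3x3 and 9x9 window sizes, the use of a box blur in place of Gaussian smoothing, and the multiplicative overlay with a brightness floor are all assumptions made for the example. The abstract says only that saliency maps from a spectral residual technique were overlaid on the original images.

```python
import numpy as np


def box_blur(a, k=3):
    """Mean filter with a k x k window (k odd), using edge padding."""
    pad = k // 2
    padded = np.pad(a, pad, mode="edge")
    h, w = a.shape
    out = np.zeros((h, w), dtype=np.float64)
    # Sum the k*k shifted views of the padded array, then average.
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + h, dx:dx + w]
    return out / (k * k)


def spectral_residual_saliency(gray, blur_size=3, smooth_size=9):
    """Saliency map in [0, 1] from a 2D grayscale image.

    Spectral residual idea: subtract a locally averaged log-amplitude
    spectrum from the log-amplitude spectrum, keep the original phase,
    and transform back to the image domain.
    """
    gray = np.asarray(gray, dtype=np.float64)
    spectrum = np.fft.fft2(gray)
    log_amp = np.log(np.abs(spectrum) + 1e-8)  # epsilon avoids log(0)
    phase = np.angle(spectrum)
    residual = log_amp - box_blur(log_amp, blur_size)
    sal = np.abs(np.fft.ifft2(np.exp(residual + 1j * phase))) ** 2
    # Assumption: a box blur stands in for the usual Gaussian smoothing.
    sal = box_blur(sal, smooth_size)
    sal -= sal.min()
    peak = sal.max()
    return sal / peak if peak > 0 else sal


def foveate(image, saliency, floor=0.2):
    """Overlay saliency on an image by per-pixel weighting.

    Assumption: 'overlay' is modeled as a multiplicative weight that
    keeps salient regions bright and dims the rest down to `floor`.
    """
    weight = floor + (1.0 - floor) * saliency
    if image.ndim == 3:
        weight = weight[..., None]  # broadcast over color channels
    return image * weight
```

A frame from the environment (here a hypothetical `frame` array of shape height x width x 3 with values in [0, 1]) could be foveated with `foveate(frame, spectral_residual_saliency(frame.mean(axis=2)))`; the agent would then be trained on either the original or the foveated frames.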

Code Implementations: 1 repo
