CV · AI · Jul 20, 2021

Critic Guided Segmentation of Rewarding Objects in First-Person Views

arXiv:2107.09540v1 · 14 citations
Originality: Incremental advance
AI Analysis

The paper addresses object segmentation in complex 3D environments with sparse rewards. The advance is rated incremental because it builds on existing imitation learning and critic methods.

The paper tackles the problem of segmenting rewarding objects in first-person views using only sparse reward signals from an imitation learning dataset. The approach was part of the first-place solution in the NeurIPS 2020 MineRL Competition Track.

This work discusses a learning approach to mask rewarding objects in images using sparse reward signals from an imitation learning dataset. To do so, we train an Hourglass network using only feedback from a critic model. The Hourglass network learns to produce a mask that, when the masked areas are swapped between a high-score image and a low-score image, decreases the critic's score of the high-score image and increases the critic's score of the low-score image. We trained the model on an imitation learning dataset from the NeurIPS 2020 MineRL Competition Track, where our model learned to mask rewarding objects in a complex interactive 3D environment with a sparse reward signal. This approach was part of the 1st place winning solution in this competition. Video demonstration and code: https://rebrand.ly/critic-guided-segmentation
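
The swap described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names `swap_masked_regions`, `swap_objective`, and `toy_critic` are hypothetical, the critic is a stand-in that scores an image by its mean brightness, and the objective (critic score of the swapped high-score image minus that of the swapped low-score image) is only one plausible reading of "decrease the high score, increase the low score". In the paper the mask comes from a trained Hourglass network and the critic is a learned model.

```python
import numpy as np

def swap_masked_regions(high_img, low_img, mask):
    """Exchange the masked pixels of two (H, W, C) images.

    mask has shape (H, W) with values in [0, 1]; 1 means "swap this pixel".
    """
    m = mask[..., None]  # add a channel axis so the mask broadcasts over C
    high_swapped = (1.0 - m) * high_img + m * low_img
    low_swapped = (1.0 - m) * low_img + m * high_img
    return high_swapped, low_swapped

def swap_objective(critic, high_img, low_img, mask):
    """Hypothetical objective: lower is better for the mask.

    A good mask removes what the critic values from the high-score image
    and transplants it into the low-score image.
    """
    high_swapped, low_swapped = swap_masked_regions(high_img, low_img, mask)
    return critic(high_swapped) - critic(low_swapped)

def toy_critic(img):
    """Stand-in critic (assumption): scores an image by mean brightness."""
    return float(img.mean())

# Toy data: the "rewarding object" is a bright 2x2 patch in the high-score image.
high = np.zeros((4, 4, 3))
high[1:3, 1:3, :] = 1.0
low = np.zeros((4, 4, 3))

good_mask = np.zeros((4, 4))
good_mask[1:3, 1:3] = 1.0      # covers exactly the bright patch
empty_mask = np.zeros((4, 4))  # swaps nothing

good_score = swap_objective(toy_critic, high, low, good_mask)
empty_score = swap_objective(toy_critic, high, low, empty_mask)
print(good_score < empty_score)
```

In this toy setting the mask that covers the bright patch yields a lower objective than the empty mask, which is the direction a mask generator would be pushed in during training. A real setup would use differentiable tensors so that gradients from the critic can flow back into the mask network.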

Code Implementations: 1 repo
