LG · CV · ML · Jul 18, 2020

Slot Contrastive Networks: A Contrastive Approach for Representing Objects

arXiv:2007.09294v1 · 14 citations
Originality: Incremental advance
AI Analysis

This paper addresses a weakness of unsupervised object-representation methods in computer vision: they tend to miss low-contrast or small objects. The contribution appears incremental, since it builds on existing contrastive and slot-based approaches.

The paper tackles unsupervised object extraction from low-level visual data by introducing a contrastive method that leverages object motion and avoids reconstructing background pixels. It is evaluated on 20 Atari games using a new metric that measures the diversity of the slot vectors.

Unsupervised extraction of objects from low-level visual data is an important goal for further progress in machine learning. Existing approaches for representing objects without labels use structured generative models with static images. These methods focus a large amount of their capacity on reconstructing unimportant background pixels, missing low-contrast or small objects. In contrast, we present a new method that avoids losses in pixel space and over-reliance on the limited signal a static image provides. Our approach takes advantage of objects' motion by learning a discriminative, time-contrastive loss in the space of slot representations, attempting to force each slot to not only capture entities that move, but also capture objects distinct from those in the other slots. Moreover, we introduce a new quantitative evaluation metric to measure how "diverse" a set of slot vectors is, and use it to evaluate our model on 20 Atari games.
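
The abstract describes the loss only in words, so the following is a minimal sketch of what a time-contrastive loss over slot vectors could look like, not the authors' implementation. It assumes an InfoNCE-style softmax over dot-product scores, which the abstract does not specify. The function name `time_contrastive_slot_loss`, the `(batch, num_slots, dim)` array layout, and the split into a "time" term and a "slot" term are all hypothetical choices made for illustration.

```python
import numpy as np


def log_softmax(logits, axis=-1):
    # Numerically stable log-softmax along the given axis.
    shifted = logits - logits.max(axis=axis, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=axis, keepdims=True))


def time_contrastive_slot_loss(slots_t, slots_next):
    """Illustrative loss over slot vectors from consecutive frames.

    slots_t, slots_next: arrays of shape (batch, num_slots, dim) holding the
    slot vectors of frame t and frame t+1 (assumed layout, not from the paper).
    """
    # Time term: for each slot k, the slot from frame t should score highest
    # against its own next frame among all next frames in the batch.
    # across_time[k, i, j] = <slots_t[i, k], slots_next[j, k]>
    across_time = np.einsum("ikd,jkd->kij", slots_t, slots_next)
    time_logp = np.diagonal(log_softmax(across_time), axis1=1, axis2=2)
    time_term = -time_logp.mean()

    # Slot term: within one example, slot k at frame t should score highest
    # against slot k at frame t+1 among all slots, which pushes different
    # slots toward encoding different things.
    # across_slots[i, k, l] = <slots_t[i, k], slots_next[i, l]>
    across_slots = np.einsum("ikd,ild->ikl", slots_t, slots_next)
    slot_logp = np.diagonal(log_softmax(across_slots), axis1=1, axis2=2)
    slot_term = -slot_logp.mean()

    return time_term + slot_term
```

Under these assumptions the loss is near zero when every slot is strongly aligned with its own successor and dissimilar to all other slots and frames, and it grows when slots are swapped between frames. No pixel-space reconstruction appears anywhere in the computation, which is the property the abstract emphasizes.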

