CV · LG · Apr 18, 2022

Inductive Biases for Object-Centric Representations in the Presence of Complex Textures

arXiv:2204.08479v3 · 16 citations · h-index: 54
Originality: Incremental advance
AI Analysis

This work addresses the challenge of learning object-centric representations from complex natural scenes, which is important for computer vision applications, though it appears incremental in its systematic comparison of existing models.

The paper investigates inductive biases for unsupervised object-centric representation learning in scenes with complex textures, finding that using a single module to reconstruct both shape and appearance improves object separation and that segmentation quality correlates more strongly with downstream usefulness than reconstruction accuracy.

Understanding which inductive biases could be helpful for the unsupervised learning of object-centric representations of natural scenes is challenging. In this paper, we systematically investigate the performance of two models on datasets where neural style transfer was used to obtain objects with complex textures while still retaining ground-truth annotations. We find that by using a single module to reconstruct both the shape and visual appearance of each object, the model learns more useful representations and achieves better object separation. In addition, we observe that adjusting the latent space size is insufficient to improve segmentation performance. Finally, the downstream usefulness of the representations is significantly more strongly correlated with segmentation quality than with reconstruction accuracy.
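To make "segmentation quality" concrete, the sketch below computes the Adjusted Rand Index (ARI) between a ground-truth and a predicted per-pixel object assignment. This is an illustration only: the abstract does not name the metric used, so the choice of ARI is an assumption, and the function name `adjusted_rand_index` and the toy masks are hypothetical.

```python
import numpy as np


def adjusted_rand_index(true_ids, pred_ids):
    """ARI between two integer label maps of the same shape.

    Assumption: ARI stands in for "segmentation quality"; the abstract
    does not specify which metric the paper uses.
    """
    true_ids = np.asarray(true_ids).ravel()
    pred_ids = np.asarray(pred_ids).ravel()
    if true_ids.shape != pred_ids.shape:
        raise ValueError("label maps must have the same number of pixels")
    if true_ids.size < 2:
        return 1.0  # no pixel pairs to disagree on

    # Map arbitrary label values to 0..K-1 so they can index a table.
    _, t = np.unique(true_ids, return_inverse=True)
    _, p = np.unique(pred_ids, return_inverse=True)

    # Contingency table: table[i, j] = pixels in true object i and predicted slot j.
    table = np.zeros((t.max() + 1, p.max() + 1), dtype=np.int64)
    np.add.at(table, (t, p), 1)

    def n_pairs(x):
        # Number of unordered pairs that can be drawn from x items.
        return x * (x - 1) / 2.0

    same_in_both = n_pairs(table).sum()
    same_in_true = n_pairs(table.sum(axis=1)).sum()
    same_in_pred = n_pairs(table.sum(axis=0)).sum()
    expected = same_in_true * same_in_pred / n_pairs(true_ids.size)
    maximum = 0.5 * (same_in_true + same_in_pred)
    if maximum == expected:
        return 1.0  # degenerate case: both partitions are trivial
    return (same_in_both - expected) / (maximum - expected)


# Toy 2x2 "image" with two objects (hypothetical data).
truth = [[0, 0], [1, 1]]
relabeled = [[1, 1], [0, 0]]   # same partition, different slot order
merged = [[0, 0], [0, 0]]      # both objects collapsed into one slot
ari_relabeled = adjusted_rand_index(truth, relabeled)
ari_merged = adjusted_rand_index(truth, merged)
```

ARI is invariant to the order of slots, which suits object-centric models because their slot assignment is arbitrary. A model that merges all objects into one slot scores at chance level even if its reconstruction is pixel-perfect, which is one way segmentation quality and reconstruction accuracy can come apart.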
