CV · Sep 5, 2024

Organized Grouped Discrete Representation for Object-Centric Learning

arXiv:2409.03553v4 · 1 citation · h-index: 45 · Has Code
Originality: Incremental advance
AI Analysis

This work addresses a specific bottleneck in object-centric learning for computer vision researchers, offering an incremental improvement over existing methods.

The paper addresses suboptimal attribute grouping in Grouped Discrete Representation (GDR) for object-centric learning, which causes information loss. It proposes Organized GDR (OGDR), which groups channels by attribute so that features decompose into attributes correctly. In unsupervised segmentation, OGDR outperforms GDR across the board, improving both transformer-based and state-of-the-art diffusion-based methods; codebook PCA and representation similarity analyses indicate better redundancy elimination and information preservation.

Object-Centric Learning (OCL) represents dense image or video pixels as sparse object features. Representative methods use discrete representations composed of Variational Autoencoder (VAE) template features to suppress pixel-level information redundancy and guide object-level feature aggregation. The most recent advancement, Grouped Discrete Representation (GDR), further decomposes these template features into attributes. However, its naive channel grouping may erroneously group channels belonging to different attributes together and discretize them as sub-optimal template attributes, which loses information and harms expressivity. We propose Organized GDR (OGDR), which organizes channels belonging to the same attribute together so that features are correctly decomposed into attributes. In unsupervised segmentation experiments, OGDR is fully superior to GDR in augmenting classical transformer-based OCL methods; it even improves state-of-the-art diffusion-based ones. Codebook PCA and representation similarity analyses show that, compared with GDR, OGDR eliminates redundancy and preserves information better for guiding object representation learning. The source code is available in the supplementary material.
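
To make the grouping issue concrete, here is a minimal NumPy sketch, not the authors' implementation. It assumes fixed, hand-written codebooks and a hand-picked channel permutation (`order`) as a stand-in for the organizing step; how OGDR actually obtains that organization is not described in this summary. The function names `grouped_quantize` and `organized_grouped_quantize` are hypothetical. In the toy data, attribute A lives in channels 0 and 2 and attribute B in channels 1 and 3, so naive contiguous grouping mixes the two attributes.

```python
import numpy as np


def grouped_quantize(features, codebooks):
    """Quantize each contiguous channel group against its own codebook.

    features:  (n, c) float array of template features.
    codebooks: list of g arrays, each (k, c // g), one per attribute group.
    Returns (indices, quantized) with shapes (n, g) and (n, c).
    """
    n, c = features.shape
    g = len(codebooks)
    assert c % g == 0, "channels must split evenly into groups"
    groups = features.reshape(n, g, c // g)
    indices = np.empty((n, g), dtype=np.int64)
    quantized = np.empty_like(groups)
    for j, codebook in enumerate(codebooks):
        # Squared Euclidean distance from every group vector to every code.
        dists = ((groups[:, j, None, :] - codebook[None, :, :]) ** 2).sum(-1)
        indices[:, j] = dists.argmin(axis=1)
        quantized[:, j] = codebook[indices[:, j]]
    return indices, quantized.reshape(n, c)


def organized_grouped_quantize(features, codebooks, order):
    """Reorder channels (assumed organizing step), quantize, restore order."""
    inverse = np.argsort(order)
    indices, quantized = grouped_quantize(features[:, order], codebooks)
    return indices, quantized[:, inverse]


# Toy data: attribute A occupies channels (0, 2), attribute B channels (1, 3).
features = np.array(
    [[0, 0, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [1, 1, 1, 1]], dtype=float
)
# Each attribute has two template values: "off" (0, 0) and "on" (1, 1).
codebooks = [np.array([[0.0, 0.0], [1.0, 1.0]]) for _ in range(2)]
order = np.array([0, 2, 1, 3])  # hypothetical: puts same-attribute channels together

naive_idx, naive_recon = grouped_quantize(features, codebooks)
organized_idx, organized_recon = organized_grouped_quantize(features, codebooks, order)

naive_error = np.abs(naive_recon - features).sum()
organized_error = np.abs(organized_recon - features).sum()
```

With the same codebooks, the naive grouping cannot represent rows where only one attribute is "on", while the reordered grouping reconstructs every row exactly. This illustrates, on toy data only, why grouping channels from different attributes together can lose information.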

