unMORE: Unsupervised Multi-Object Segmentation via Center-Boundary Reasoning
The work addresses the problem of segmenting multiple complex objects in images without human labels, a known bottleneck for computer vision applications, and proposes a novel method for it.
The paper tackles unsupervised multi-object segmentation in real-world images by introducing unMORE, a two-stage pipeline that learns object-centric representations and uses a network-free reasoning module, achieving state-of-the-art results on 6 benchmark datasets including COCO and excelling in crowded images where baselines fail.
We study the challenging problem of unsupervised multi-object segmentation on single images. Existing methods, which rely on image reconstruction objectives to learn objectness or leverage pretrained image features to group similar pixels, often succeed only in segmenting simple synthetic objects or discovering a limited number of real-world objects. In this paper, we introduce unMORE, a novel two-stage pipeline designed to identify many complex objects in real-world images. The key to our approach is to explicitly learn three levels of carefully defined object-centric representations in the first stage. In the second stage, our multi-object reasoning module uses these learned object priors to discover multiple objects. Notably, this reasoning module is entirely network-free and does not require human labels. Extensive experiments demonstrate that unMORE significantly outperforms all existing unsupervised methods across 6 real-world benchmark datasets, including the challenging COCO dataset, achieving state-of-the-art object segmentation results. Remarkably, our method excels in crowded images where all baselines collapse.