Compositor: Bottom-up Clustering and Compositing for Robust Part and Object Segmentation
This work addresses robust joint part and object segmentation, reporting modest mIoU gains and larger gains in robustness against occlusion over previous methods.
The paper tackles joint part and object segmentation by reformulating it as an optimization problem solved bottom-up over hierarchical pixel, part, and object-level embeddings. It achieves state-of-the-art performance on PartImageNet and Pascal-Part, with improvements of around 0.4-1.7% in mIoU and better robustness against occlusion by around 4.4% on parts and 7.1% on objects.
In this work, we present a robust approach for joint part and object segmentation. Specifically, we reformulate object and part segmentation as an optimization problem and build a hierarchical feature representation including pixel, part, and object-level embeddings to solve it in a bottom-up clustering manner. Pixels are grouped into several clusters where the part-level embeddings serve as cluster centers. Afterwards, object masks are obtained by compositing the part proposals. This bottom-up interaction is shown to be effective in integrating information from lower semantic levels to higher semantic levels. Based on that, our novel approach Compositor produces part and object segmentation masks simultaneously while improving the mask quality. Compositor achieves state-of-the-art performance on PartImageNet and Pascal-Part, outperforming previous methods in part and object mIoU by around 0.9% and 1.3% on PartImageNet and by around 0.4% and 1.7% on Pascal-Part. It also demonstrates better robustness against occlusion, by around 4.4% and 7.1% on part and object respectively. Code will be available at https://github.com/TACJu/Compositor.
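The bottom-up flow described in the abstract (pixels clustered around part-level embeddings, then part proposals composited into object masks) can be illustrated with a toy NumPy sketch. This is not the Compositor implementation: the function names `cluster_pixels` and `composite_objects`, the hard argmax assignment on dot-product similarity, and the rule that each part joins its most similar object embedding are all assumptions made for illustration only.

```python
import numpy as np

def cluster_pixels(pixel_emb, part_emb):
    """Hard-assign each pixel to its most similar part embedding.

    pixel_emb: (N, D) pixel-level embeddings.
    part_emb:  (K, D) part-level embeddings acting as cluster centers.
    Returns boolean part masks of shape (K, N).
    Assumption: dot-product similarity with argmax assignment.
    """
    sim = pixel_emb @ part_emb.T                 # (N, K) pixel-to-part similarity
    assign = sim.argmax(axis=1)                  # (N,) index of the winning part
    one_hot = np.eye(part_emb.shape[0], dtype=bool)[assign]  # (N, K)
    return one_hot.T                             # (K, N)

def composite_objects(part_masks, part_emb, object_emb):
    """Composite part masks into object masks.

    Assumption: each part is attached to the object embedding it is most
    similar to, and an object mask is the union of its parts' masks.
    Returns (object_masks of shape (M, N), part_to_obj of shape (K,)).
    """
    part_to_obj = (part_emb @ object_emb.T).argmax(axis=1)   # (K,)
    object_masks = np.zeros((object_emb.shape[0], part_masks.shape[1]), dtype=bool)
    for k, m in enumerate(part_to_obj):
        object_masks[m] |= part_masks[k]         # union of the part proposals
    return object_masks, part_to_obj

# Toy inputs (hypothetical values): 4 pixels, 3 parts, 2 objects, D = 3.
pixel_emb = np.array([[1.0, 0.0, 0.0],
                      [0.0, 1.0, 0.0],
                      [0.0, 0.9, 0.1],
                      [0.0, 0.0, 1.0]])
part_emb = np.eye(3)
object_emb = np.array([[1.0, 1.0, 0.0],
                       [0.0, 0.0, 1.0]])

part_masks = cluster_pixels(pixel_emb, part_emb)
object_masks, part_to_obj = composite_objects(part_masks, part_emb, object_emb)
print(part_masks.astype(int))
print(part_to_obj)
print(object_masks.astype(int))
```

Because every pixel belongs to exactly one part and every part to exactly one object in this sketch, the part and object masks are produced together and stay consistent with each other by construction, which mirrors the simultaneous part and object prediction the abstract describes.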