Object Segmentation Without Labels with Large-Scale Generative Models
This work addresses the challenge of reducing dependence on labeled data in computer vision tasks such as object segmentation. That matters to researchers and practitioners in AI and vision, although the contribution builds incrementally on prior advances in unsupervised learning.
The paper tackles unsupervised object segmentation without any pixel-level or image-level labels. It leverages large-scale generative models, specifically unsupervised GANs, to separate foreground from background pixels and produce high-quality saliency masks. The authors report new state-of-the-art results on standard benchmarks, outperforming existing unsupervised alternatives.
The recent rise of unsupervised and self-supervised learning has dramatically reduced the dependency on labeled data, providing effective image representations for transfer to downstream vision tasks. Furthermore, recent works have employed these representations in a fully unsupervised setup for image classification, reducing the need for human labels at the fine-tuning stage as well. This work demonstrates that large-scale unsupervised models can also perform the more challenging task of object segmentation, requiring neither pixel-level nor image-level labeling. Namely, we show that recent unsupervised GANs make it possible to differentiate between foreground and background pixels, providing high-quality saliency masks. Through extensive comparison on standard benchmarks, we show that our approach outperforms existing unsupervised alternatives for object segmentation, achieving a new state of the art.
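To make the benchmark comparison concrete, here is a minimal sketch of how a predicted saliency mask might be scored against a ground-truth mask. The text above does not say which metrics are used, so intersection over union (IoU) is an assumption here, chosen because it is a common score for binary masks. The function names `binarize` and `mask_iou`, the 0.5 threshold, and the toy values are all hypothetical and not taken from the paper.

```python
def binarize(saliency, threshold=0.5):
    """Turn a 2D grid of saliency scores in [0, 1] into a boolean mask.

    The 0.5 threshold is an illustrative assumption, not a value from the paper.
    """
    return [[score >= threshold for score in row] for row in saliency]


def mask_iou(pred, target):
    """Intersection over union of two equally sized 2D binary masks."""
    if len(pred) != len(target) or any(
        len(p_row) != len(t_row) for p_row, t_row in zip(pred, target)
    ):
        raise ValueError("masks must have the same shape")

    intersection = 0
    union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            intersection += bool(p) and bool(t)  # foreground in both masks
            union += bool(p) or bool(t)          # foreground in either mask

    # Two empty masks agree perfectly; this also avoids dividing by zero.
    if union == 0:
        return 1.0
    return intersection / union


# Toy 2x3 example: hypothetical saliency scores and a hand-made ground truth.
saliency = [[0.9, 0.8, 0.1],
            [0.7, 0.2, 0.0]]
ground_truth = [[1, 1, 0],
                [0, 0, 0]]

iou = mask_iou(binarize(saliency), ground_truth)
print(f"IoU = {iou:.3f}")  # prints "IoU = 0.667"
```

In this toy case the prediction marks three pixels as foreground and two of them match the ground truth, so the intersection is 2, the union is 3, and the IoU is 2/3. A benchmark score would typically average such a per-image value over the whole dataset.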