CV · Nov 22, 2017

W-Net: A Deep Model for Fully Unsupervised Image Segmentation

arXiv:1711.08506v1 · 25.3 · 280 citations

Originality: Incremental advance

AI Analysis

This paper addresses unsupervised image segmentation for domains that lack pixel-level labels, offering a novel deep learning approach.

The paper proposes W-Net, a deep architecture for unsupervised image segmentation that concatenates two fully convolutional networks into an autoencoder. Training jointly minimizes the autoencoder's reconstruction error and the normalized cut of the encoder's output. With postprocessing, the method outperforms a number of competing methods on the Berkeley Segmentation Data Set.

While significant attention has been recently focused on designing supervised deep semantic segmentation algorithms for vision tasks, there are many domains in which sufficient supervised pixel-level labels are difficult to obtain. In this paper, we revisit the problem of purely unsupervised image segmentation and propose a novel deep architecture for this problem. We borrow recent ideas from supervised semantic segmentation methods, in particular by concatenating two fully convolutional networks together into an autoencoder--one for encoding and one for decoding. The encoding layer produces a k-way pixelwise prediction, and both the reconstruction error of the autoencoder as well as the normalized cut produced by the encoder are jointly minimized during training. When combined with suitable postprocessing involving conditional random field smoothing and hierarchical segmentation, our resulting algorithm achieves impressive results on the benchmark Berkeley Segmentation Data Set, outperforming a number of competing methods.
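
The two training objectives named in the abstract can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: it assumes a soft (differentiable) relaxation of the normalized cut over k-way pixel probabilities and a Gaussian affinity on scalar pixel intensities, and the names `gaussian_affinity`, `soft_ncut_loss`, `reconstruction_loss`, and the parameter `sigma` are hypothetical, chosen only for illustration.

```python
import math


def gaussian_affinity(pixels, sigma=0.1):
    """Dense affinity w[u][v] between scalar pixel intensities (assumed form)."""
    n = len(pixels)
    return [
        [math.exp(-((pixels[u] - pixels[v]) ** 2) / (sigma ** 2)) for v in range(n)]
        for u in range(n)
    ]


def soft_ncut_loss(probs, weights):
    """Soft normalized cut: K minus the sum of per-class normalized associations.

    probs[u][c] is the encoder's probability that pixel u belongs to class c.
    """
    n = len(probs)
    num_classes = len(probs[0])
    degree = [sum(row) for row in weights]  # total affinity of each pixel
    loss = float(num_classes)
    for c in range(num_classes):
        # Affinity mass that stays inside class c.
        assoc = sum(
            weights[u][v] * probs[u][c] * probs[v][c]
            for u in range(n)
            for v in range(n)
        )
        # Affinity mass from class c to every pixel.
        total = sum(probs[u][c] * degree[u] for u in range(n))
        if total > 0:  # an empty class contributes no association
            loss -= assoc / total
    return loss


def reconstruction_loss(original, reconstructed):
    """Mean squared error between the input and the decoder's output."""
    return sum((a - b) ** 2 for a, b in zip(original, reconstructed)) / len(original)


# Toy 4-pixel "image": two dark pixels followed by two bright pixels.
pixels = [0.0, 0.0, 1.0, 1.0]
weights = gaussian_affinity(pixels)

grouped = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]  # matches intensities
mixed = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]]    # splits similar pixels

grouped_loss = soft_ncut_loss(grouped, weights)
mixed_loss = soft_ncut_loss(mixed, weights)
recon = reconstruction_loss(pixels, [0.1, 0.0, 0.9, 1.0])
print(grouped_loss, mixed_loss, recon)
```

In this toy case the assignment that groups similar pixels yields the lower normalized-cut loss, which is the behavior the encoder is trained toward, while the reconstruction term keeps the k-way encoding informative enough for the decoder to rebuild the input.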
