CV · Sep 18, 2019

Extremely Weak Supervised Image-to-Image Translation for Semantic Segmentation

Samarth Shukla, Luc Van Gool, Radu Timofte

arXiv:1909.08542v15.411 citations

Originality: Incremental advance

AI Analysis

The paper addresses the data efficiency problem for computer vision researchers and practitioners, offering a more practical option than either fully supervised or fully unsupervised training.

The paper tackles the cost of paired training data in image-to-image translation for semantic segmentation. It proposes a method for selecting a very small number of paired samples and reports that a model trained with a single selected pair performs comparably to a supervised model trained on thousands of pairs.

Recent advances in generative models and adversarial training have led to a flourishing image-to-image (I2I) translation literature. The current I2I translation approaches require training images from the two domains that are either all paired (supervised) or all unpaired (unsupervised). In practice, obtaining paired training data in sufficient quantities is often very costly and cumbersome. Therefore solutions that employ unpaired data, while less accurate, are largely preferred. In this paper, we aim to bridge the gap between supervised and unsupervised I2I translation, with application to semantic image segmentation. We build upon pix2pix and CycleGAN, state-of-the-art seminal I2I translation techniques. We propose a method to select (very few) paired training samples and achieve significant improvements in both supervised and unsupervised I2I translation settings over random selection. Further, we boost the performance by incorporating both (selected) paired and unpaired samples in the training process. Our experiments show that an extremely weak supervised I2I translation solution using only one paired training sample can achieve a quantitative performance much better than the unsupervised CycleGAN model, and comparable to that of the supervised pix2pix model trained on thousands of pairs.
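
The idea of training on both paired and unpaired samples can be illustrated with a rough sketch. The code below is not the authors' implementation: it assumes a pix2pix-style L1 term on the few paired samples and a CycleGAN-style cycle-consistency term on the unpaired ones, and it leaves out the adversarial losses and the paper's sample-selection step entirely. The names `mixed_loss`, `lam_pair`, and `lam_cyc` are hypothetical, and `G` and `F` stand in for the two translation networks.

```python
import numpy as np


def l1(a, b):
    """Mean absolute error between two arrays of the same shape."""
    return float(np.mean(np.abs(a - b)))


def mixed_loss(G, F, paired, unpaired_x, unpaired_y, lam_pair=1.0, lam_cyc=1.0):
    """Illustrative objective mixing paired and unpaired data.

    G maps domain X to domain Y, F maps Y back to X (both plain callables here).
    paired      : list of (x, y) tuples with known correspondence
    unpaired_x  : list of X-domain samples without a partner
    unpaired_y  : list of Y-domain samples without a partner
    The weights lam_pair and lam_cyc are assumptions, not values from the paper.
    """
    # Supervised term: translated x should match its paired y.
    pair_term = np.mean([l1(G(x), y) for x, y in paired]) if paired else 0.0
    # Cycle terms: translating there and back should recover the input.
    cyc_x = np.mean([l1(F(G(x)), x) for x in unpaired_x]) if unpaired_x else 0.0
    cyc_y = np.mean([l1(G(F(y)), y) for y in unpaired_y]) if unpaired_y else 0.0
    return float(lam_pair * pair_term + lam_cyc * (cyc_x + cyc_y))


# Toy usage: G doubles, F halves, so the cycle terms vanish and only
# the single paired sample contributes to the loss.
G = lambda x: 2.0 * x
F = lambda y: y / 2.0
x = np.ones((4, 4))
loss = mixed_loss(G, F, paired=[(x, 3.0 * x)], unpaired_x=[x], unpaired_y=[2.0 * x])
```

In a real setup `G` and `F` would be trained networks and the loss would be minimized alongside the adversarial objectives; the sketch only shows how one paired sample and many unpaired ones could contribute to a single training signal.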
