CV · Sep 18, 2019

Extremely Weak Supervised Image-to-Image Translation for Semantic Segmentation

arXiv:1909.08542v1 · 11 citations
Originality: Incremental advance
AI Analysis

This work addresses data efficiency for computer vision researchers and practitioners: it offers a middle ground between fully supervised methods, which need many costly paired samples, and unsupervised methods, which are less accurate.

The paper tackles the cost of paired training data in image-to-image translation for semantic segmentation. It proposes a method for selecting a very small number of paired samples and reports that, with only one paired sample, performance is much better than the unsupervised CycleGAN model and comparable to a supervised pix2pix model trained on thousands of pairs.

Recent advances in generative models and adversarial training have led to a flourishing image-to-image (I2I) translation literature. The current I2I translation approaches require training images from the two domains that are either all paired (supervised) or all unpaired (unsupervised). In practice, obtaining paired training data in sufficient quantities is often very costly and cumbersome. Therefore solutions that employ unpaired data, while less accurate, are largely preferred. In this paper, we aim to bridge the gap between supervised and unsupervised I2I translation, with application to semantic image segmentation. We build upon pix2pix and CycleGAN, state-of-the-art seminal I2I translation techniques. We propose a method to select (very few) paired training samples and achieve significant improvements in both supervised and unsupervised I2I translation settings over random selection. Further, we boost the performance by incorporating both (selected) paired and unpaired samples in the training process. Our experiments show that an extremely weak supervised I2I translation solution using only one paired training sample can achieve a quantitative performance much better than the unsupervised CycleGAN model, and comparable to that of the supervised pix2pix model trained on thousands of pairs.
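The idea of training on both paired and unpaired samples can be illustrated with a toy objective. The sketch below is not the paper's implementation: it assumes a pix2pix-style L1 term on the few paired samples plus a CycleGAN-style cycle-consistency term on the unpaired ones, and it omits the adversarial terms entirely. The name `hybrid_loss`, the generators `G` and `F`, and the weights `lam_pair` and `lam_cyc` are hypothetical choices made for illustration.

```python
import numpy as np


def l1(a, b):
    """Mean absolute error between two arrays of the same shape."""
    return float(np.mean(np.abs(a - b)))


def hybrid_loss(G, F, paired, unpaired_x, unpaired_y,
                lam_pair=100.0, lam_cyc=10.0):
    """Toy objective mixing paired and unpaired supervision.

    G maps domain X -> Y and F maps Y -> X (plain callables here).
    paired is a list of (x, y) tuples; unpaired_x and unpaired_y are lists
    of arrays. The weights are illustrative assumptions, and the
    adversarial losses of pix2pix/CycleGAN are deliberately left out.
    """
    # Supervised term: G(x) should match the ground-truth y of each pair.
    sup = np.mean([l1(G(x), y) for x, y in paired]) if paired else 0.0
    # Cycle terms: translating there and back should recover the input.
    cyc_x = np.mean([l1(F(G(x)), x) for x in unpaired_x]) if unpaired_x else 0.0
    cyc_y = np.mean([l1(G(F(y)), y) for y in unpaired_y]) if unpaired_y else 0.0
    return float(lam_pair * sup + lam_cyc * (cyc_x + cyc_y))


# Toy "generators": G doubles its input, F halves it, so cycles are exact.
G = lambda x: 2.0 * x
F = lambda y: y / 2.0

x = np.ones((4, 4))
one_pair = [(x, 2.0 * x + 1.0)]          # a single paired sample, off by 1
loss = hybrid_loss(G, F, one_pair, [x], [2.0 * x])
```

In this toy setting the cycle terms vanish and the single paired sample contributes the whole loss, which mirrors the extremely weak supervision regime described in the abstract: one pair provides the only direct signal, while the unpaired data constrains the two mappings to be mutually consistent.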

Code Implementations: 1 repo