CV · Nov 26, 2017

In2I : Unsupervised Multi-Image-to-Image Translation Using Generative Adversarial Networks

arXiv:1711.09334v1 · 30 citations
Originality: Incremental advance
AI Analysis

This paper addresses the challenge of generating high-quality translated images from multi-modal inputs for computer vision applications. It is rated incremental because it builds on existing unsupervised image-to-image translation methods.

The paper extends unsupervised image-to-image translation to multiple input images. It uses a GAN-based framework with a multi-modal generator and a latent consistency loss, and reports improved visual quality and better results than state-of-the-art unsupervised translation methods.

In unsupervised image-to-image translation, the goal is to learn the mapping between an input image and an output image using a set of unpaired training images. In this paper, we propose an extension of the unsupervised image-to-image translation problem to the multiple-input setting. Given a set of paired images from multiple modalities, a transformation is learned to translate the input into a specified domain. For this purpose, we introduce a Generative Adversarial Network (GAN)-based framework along with a multi-modal generator structure and a new loss term, the latent consistency loss. Through various experiments, we show that leveraging multiple inputs generally improves the visual quality of the translated images. Moreover, we show that the proposed method outperforms current state-of-the-art unsupervised image-to-image translation methods.
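The abstract names a latent consistency loss but does not define it. As a rough illustration only, the sketch below assumes one plausible reading: the latent code produced from the multi-modal input should match the latent code obtained by re-encoding the translated output, with the mismatch measured as a mean absolute difference. The function names (`fuse_encode`, `latent_consistency_loss`), the weighted-sum fusion, and the toy vectors are all hypothetical and are not taken from the paper or its code.

```python
from typing import List, Sequence

Vector = List[float]


def fuse_encode(inputs: Sequence[Vector], weights: Sequence[float]) -> Vector:
    """Hypothetical multi-input encoder.

    Fuses one feature vector per modality into a single latent vector
    using a weighted sum. A real generator would use learned
    convolutional encoders; this is only a stand-in.
    """
    if len(inputs) != len(weights):
        raise ValueError("need exactly one weight per input modality")
    dim = len(inputs[0])
    if any(len(x) != dim for x in inputs):
        raise ValueError("all modality vectors must have the same length")
    return [sum(w * x[i] for w, x in zip(weights, inputs)) for i in range(dim)]


def latent_consistency_loss(z_forward: Vector, z_reverse: Vector) -> float:
    """Assumed form of the loss: mean absolute difference of two latents."""
    if len(z_forward) != len(z_reverse):
        raise ValueError("latent vectors must have the same length")
    total = sum(abs(a - b) for a, b in zip(z_forward, z_reverse))
    return total / len(z_forward)


# Toy example: two input modalities, each a 2-dimensional feature vector.
modality_a = [1.0, 2.0]
modality_b = [3.0, 4.0]
z_fwd = fuse_encode([modality_a, modality_b], weights=[0.5, 0.5])

# Pretend this is the latent obtained by re-encoding the translated image.
z_rev = [2.5, 2.0]

loss = latent_consistency_loss(z_fwd, z_rev)
print(z_fwd, loss)
```

In a training loop, a term like this would typically be added to the adversarial and cycle-consistency objectives with its own weight; the abstract does not state how the terms are combined.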

Code Implementations: 1 repo
