CV · Apr 14, 2019

Biphasic Learning of GANs for High-Resolution Image-to-Image Translation

arXiv:1904.06624v1 · 7 citations
Originality: Highly original
AI Analysis

This paper addresses a specific bottleneck in generative models for high-resolution image synthesis, offering incremental improvements in training stability and quality for applications like face editing.

The paper tackles the problem of training instability and poor sample quality in high-resolution image-to-image translation by proposing a biphasic learning framework for GANs, achieving results at 1024^2 resolution and significantly outperforming existing methods in face-related synthesis tasks.

Although recent progress in generative models has significantly improved the performance of image-to-image translation, current methods still suffer from severe degradation in training stability and sample quality at high resolution. In this work, we present a novel training framework for GANs, namely biphasic learning, to achieve image-to-image translation in multiple visual domains at $1024^2$ resolution. Our core idea is to design an adjustable objective function that varies across training phases. Within the biphasic learning framework, we propose a novel inherited adversarial loss to enhance model capacity and stabilize the training phase transition. Furthermore, we introduce a perceptual-level consistency loss through mutual information estimation and maximization. To verify the superiority of the proposed method, we apply it to a wide range of face-related synthesis tasks and conduct experiments on multiple large-scale datasets. Through comprehensive quantitative analyses, we demonstrate that our method significantly outperforms existing methods.

