Unsupervised Image-to-Image Translation with Generative Prior
This work addresses a key bottleneck in image translation for computer vision applications, though the contribution is incremental because it builds on existing GAN-based methods.
The paper tackles unsupervised image-to-image translation between domains with drastic visual discrepancies by leveraging the generative prior of pre-trained GANs. It reports improved quality and versatility, with experiments showing superiority over state-of-the-art methods in challenging scenarios.
Unsupervised image-to-image translation aims to learn the translation between two visual domains without paired data. Despite the recent progress in image translation models, it remains challenging to build mappings between complex domains with drastic visual discrepancies. In this work, we present a novel framework, Generative Prior-guided UNsupervised Image-to-image Translation (GP-UNIT), to improve the overall quality and applicability of the translation algorithm. Our key insight is to leverage the generative prior from pre-trained class-conditional GANs (e.g., BigGAN) to learn rich content correspondences across various domains. We propose a novel coarse-to-fine scheme: we first distill the generative prior to capture a robust coarse-level content representation that can link objects at an abstract semantic level, based on which fine-level content features are adaptively learned for more accurate multi-level content correspondences. Extensive experiments demonstrate the superiority of our versatile framework over state-of-the-art methods in robust, high-quality and diversified translations, even for challenging and distant domains.
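To make the coarse-to-fine idea concrete, here is a toy sketch of a generic coarse-to-fine correspondence search. It is not GP-UNIT itself: the paper learns its content features by distilling a BigGAN prior, whereas this example uses hand-made 2-D feature vectors, average pooling as the "coarse" level, and cosine similarity for matching. The function names (`cosine`, `pool`, `coarse_to_fine_match`) and the `size` parameter are all assumptions introduced only for illustration.

```python
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def pool(features, size):
    """Average each group of `size` consecutive feature vectors into one coarse vector."""
    pooled = []
    for start in range(0, len(features), size):
        group = features[start:start + size]
        pooled.append([sum(col) / len(group) for col in zip(*group)])
    return pooled


def coarse_to_fine_match(src, dst, size=2):
    """For each source position, return the index of its matched target position."""
    src_coarse, dst_coarse = pool(src, size), pool(dst, size)
    matches = []
    for i, vec in enumerate(src):
        # Coarse step: pick the target cell most similar to this position's source cell.
        cell = src_coarse[i // size]
        best_cell = max(range(len(dst_coarse)),
                        key=lambda j: cosine(cell, dst_coarse[j]))
        # Fine step: search only the positions inside the chosen target cell.
        lo = best_cell * size
        hi = min(lo + size, len(dst))
        matches.append(max(range(lo, hi), key=lambda k: cosine(vec, dst[k])))
    return matches


# Hypothetical hand-made features for four positions in each "image".
src = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
dst = [[0.1, 0.9], [0.0, 1.0], [0.9, 0.1], [1.0, 0.0]]
matches = coarse_to_fine_match(src, dst)
print(matches)
```

With these inputs the target is the source in reverse order, and `matches` comes out as `[3, 2, 1, 0]`: the coarse step first links each source cell to the right target cell, and the fine step then resolves the exact position within that cell.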