CV · Jul 6, 2022

DCT-Net: Domain-Calibrated Translation for Portrait Stylization

arXiv:2207.02426v1 · 11 citations · h-index: 59
Originality: Incremental advance
AI Analysis

This paper addresses overfitting in few-shot style transfer for portrait images, offering a domain-specific solution for artists and designers.

The paper tackles few-shot portrait stylization by introducing DCT-Net, which uses a 'calibration first, translation later' approach to avoid overfitting with limited style exemplars (~100), achieving high-quality style transfer and full-body translation with adaptive deformations.

This paper introduces DCT-Net, a novel image translation architecture for few-shot portrait stylization. Given limited style exemplars ($\sim$100), the new architecture can produce high-quality style transfer results with advanced ability to synthesize high-fidelity contents and strong generality to handle complicated scenes (e.g., occlusions and accessories). Moreover, it enables full-body image translation via one elegant evaluation network trained by partial observations (i.e., stylized heads). Few-shot learning based style transfer is challenging since the learned model can easily become overfitted in the target domain, due to the biased distribution formed by only a few training examples. This paper aims to handle the challenge by adopting the key idea of "calibration first, translation later" and exploring the augmented global structure with locally-focused translation. Specifically, the proposed DCT-Net consists of three modules: a content adapter borrowing the powerful prior from source photos to calibrate the content distribution of target samples; a geometry expansion module using affine transformations to release spatially semantic constraints; and a texture translation module leveraging samples produced by the calibrated distribution to learn a fine-grained conversion. Experimental results demonstrate the proposed method's superiority over the state of the art in head stylization and its effectiveness on full image translation with adaptive deformations.
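The abstract says only that the geometry expansion module uses affine transformations to release spatially semantic constraints. As a rough illustration of that idea, the sketch below samples a random rotation-plus-scale matrix and warps a small image about its centre. The function names (`random_affine`, `warp_nearest`), the choice of rotation and isotropic scale, the parameter ranges, and the nearest-neighbour sampling are all assumptions made for this example. They are not taken from the paper or its released code.

```python
import math
import random


def random_affine(rng, max_angle_deg=30.0, scale_range=(0.8, 1.2)):
    """Sample a 2x3 affine matrix: rotation plus isotropic scale, no translation.

    The angle and scale ranges are illustrative assumptions, not paper values.
    """
    angle = math.radians(rng.uniform(-max_angle_deg, max_angle_deg))
    scale = rng.uniform(*scale_range)
    c, s = scale * math.cos(angle), scale * math.sin(angle)
    return [[c, -s, 0.0], [s, c, 0.0]]


def warp_nearest(image, matrix, fill=0):
    """Warp a 2D list-of-lists image about its centre.

    Uses inverse mapping with nearest-neighbour sampling; only the linear
    part of `matrix` is applied, and out-of-range pixels get `fill`.
    """
    h, w = len(image), len(image[0])
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    (a, b, _), (c, d, _) = matrix
    det = a * d - b * c
    # Inverse of the 2x2 linear part, used to look up source pixels.
    ia, ib, ic, id_ = d / det, -b / det, -c / det, a / det
    out = [[fill] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            dx, dy = x - cx, y - cy
            sx = ia * dx + ib * dy + cx
            sy = ic * dx + id_ * dy + cy
            xi, yi = round(sx), round(sy)
            if 0 <= xi < w and 0 <= yi < h:
                out[y][x] = image[yi][xi]
    return out


if __name__ == "__main__":
    rng = random.Random(0)
    image = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
    print(warp_nearest(image, random_affine(rng)))
```

In a training pipeline, a warp of this kind would presumably be applied to both source photos and style exemplars so that the translation network sees faces at varied scales and orientations; how DCT-Net actually does this is not specified in the abstract.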

Code Implementations: 3 repos