CV Aug 17, 2024

FPGA: Flexible Portrait Generation Approach

arXiv:2408.09248v32 citations h-index: 4
Originality: Incremental advance
AI Analysis

This work addresses portrait fidelity generation for applications such as photo editing and face-swapping, but it is incremental because it builds on existing diffusion-based methods.

The paper tackles two problems in full-body portrait generation, low-resolution faces and multi-ID images, by proposing FPGA, a system that combines a training strategy with an inference framework. It reports significant advantages in both subjective and objective metrics and accelerates inference to within 2.5 seconds on a single L20 GPU.

Portrait Fidelity Generation is a prominent research area in generative models. Current methods struggle to generate full-body images in which faces are low-resolution, especially in photos containing multiple IDs. To tackle these issues, we propose a comprehensive system called FPGA and construct IDZoom, a million-level multi-modal dataset, for training. FPGA consists of a Multi-Mode Fusion training strategy (MMF) and a DDIM Inversion based ID Restoration inference framework (DIIR). MMF aims to activate the specified ID in the specified facial region. DIIR aims to remove face artifacts while preserving the background. Furthermore, DIIR is plug-and-play and can be applied to any diffusion-based portrait generation method to enhance its performance. DIIR can also perform face-swapping tasks and is applicable to stylized faces. To validate the effectiveness of FPGA, we conducted extensive comparative and ablation experiments. The experimental results demonstrate that FPGA has significant advantages in both subjective and objective metrics and achieves controllable generation in multi-ID scenarios. In addition, we accelerate inference to within 2.5 seconds on a single L20 graphics card, mainly through our well-designed reparameterization method, RepControlNet.
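
The abstract names DDIM inversion as the basis of DIIR but does not spell out the procedure. As a rough illustration of generic deterministic DDIM inversion only (not the paper's DIIR pipeline), the sketch below maps a clean sample to a noisy latent and back. The names `ddim_step`, `ddim_invert`, `ddim_sample`, and `toy_eps_model` are hypothetical, the noise schedule is arbitrary, and the noise predictor is a constant stand-in for a trained network, which is why the round trip here is exact rather than approximate.

```python
import numpy as np

def ddim_step(x, eps, a_from, a_to):
    """Deterministic DDIM update (eta = 0) between two cumulative-alpha levels."""
    # Estimate the clean sample implied by x and the predicted noise.
    x0_pred = (x - np.sqrt(1.0 - a_from) * eps) / np.sqrt(a_from)
    # Re-noise that estimate to the target level using the same noise.
    return np.sqrt(a_to) * x0_pred + np.sqrt(1.0 - a_to) * eps

def toy_eps_model(x, t):
    # Hypothetical stand-in for a trained noise predictor (assumption):
    # it ignores its inputs and returns a constant noise estimate.
    return np.full_like(x, 0.1)

def ddim_invert(x0, alphas_bar, eps_model):
    """Walk from the clean sample (index 0) toward the noisiest level."""
    x = x0
    for t in range(len(alphas_bar) - 1):
        x = ddim_step(x, eps_model(x, t), alphas_bar[t], alphas_bar[t + 1])
    return x

def ddim_sample(x_T, alphas_bar, eps_model):
    """Walk back from the noisiest level to the clean sample."""
    x = x_T
    for t in range(len(alphas_bar) - 1, 0, -1):
        x = ddim_step(x, eps_model(x, t), alphas_bar[t], alphas_bar[t - 1])
    return x

if __name__ == "__main__":
    alphas_bar = np.linspace(1.0, 0.05, 11)  # arbitrary decreasing schedule
    x0 = np.array([0.5, -1.0, 2.0])
    x_T = ddim_invert(x0, alphas_bar, toy_eps_model)
    x_rec = ddim_sample(x_T, alphas_bar, toy_eps_model)
    print("max reconstruction error:", np.abs(x_rec - x0).max())
```

With a real, input-dependent noise predictor the inversion is only approximate, because each inversion step evaluates the predictor at the current latent rather than at the next one. How DIIR uses the inverted latent to restore the face while keeping the background is not described in the abstract and is not modeled here.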

