CV | Dec 13, 2022

HS-Diffusion: Semantic-Mixing Diffusion for Head Swapping

arXiv:2212.06458v3 | 13 citations | h-index: 105 | Has Code
AI Analysis

The paper addresses a seldom-studied image-editing problem, generating realistic head swaps, though the contribution appears incremental because it builds on existing diffusion models.

The paper tackles head swapping with HS-Diffusion, a semantic-mixing diffusion model that blends the semantic layouts of the head and body and inpaints the transition region between them. The method preserves the head and body with high-quality reconstruction, and the authors report superior performance on a new benchmark evaluated with two new metrics, Mask-FID and Focal-FID.

The image-based head swapping task aims to stitch a head from one source image seamlessly onto a body from another. This seldom-studied task faces two major challenges: 1) preserving the head and body from different sources while generating a seamless transition region; 2) the lack of a paired head swapping dataset and benchmark. In this paper, we propose a semantic-mixing diffusion model for head swapping (HS-Diffusion), which consists of a latent diffusion model (LDM) and a semantic layout generator. We blend the semantic layouts of the source head and source body, and then inpaint the transition region with the semantic layout generator, achieving coarse-grained head swapping. The semantic-mixing LDM then performs fine-grained head swapping, conditioned on the inpainted layout, through a progressive fusion process that preserves the head and body with high-quality reconstruction. To this end, we propose a semantic calibration strategy for natural inpainting and a neck alignment technique for geometric realism. Importantly, we construct a new image-based head swapping benchmark and design two tailored metrics (Mask-FID and Focal-FID). Extensive experiments demonstrate the superiority of our framework. The code will be available at https://github.com/qinghew/HS-Diffusion.
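The layout-blending step described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the label ids, the `TRANSITION` marker, and the `mix_layouts` function are hypothetical names chosen for illustration, and the rule used to decide which pixels form the transition region is an assumption. The sketch keeps head pixels from the head layout and body pixels from the body layout, and marks every other non-background pixel as a region that a layout generator would have to inpaint.

```python
import numpy as np

# Hypothetical label ids; the label set used by the paper is not given here.
BACKGROUND, HEAD, NECK, BODY = 0, 1, 2, 3
TRANSITION = 255  # assumed marker for pixels left to the layout generator


def mix_layouts(head_layout, body_layout):
    """Blend two semantic layouts: head from one, body from the other.

    Returns the mixed layout and a boolean mask of the transition region.
    """
    if head_layout.shape != body_layout.shape:
        raise ValueError("layouts must share the same shape")

    head_mask = head_layout == HEAD
    # Body pixels are kept unless the new head covers them.
    body_mask = (body_layout == BODY) & ~head_mask
    both_background = (head_layout == BACKGROUND) & (body_layout == BACKGROUND)

    # Start with everything undecided, then fill in what is known.
    mixed = np.full(head_layout.shape, TRANSITION, dtype=np.uint8)
    mixed[both_background] = BACKGROUND
    mixed[body_mask] = BODY
    mixed[head_mask] = HEAD
    return mixed, mixed == TRANSITION


# Tiny 4x3 toy layouts: one column of head and neck above a row of body.
head_layout = np.array(
    [[0, 1, 0],
     [0, 1, 0],
     [0, 2, 0],
     [3, 3, 3]], dtype=np.uint8)
body_layout = np.array(
    [[0, 0, 0],
     [1, 1, 0],
     [0, 2, 0],
     [3, 3, 3]], dtype=np.uint8)

mixed, to_inpaint = mix_layouts(head_layout, body_layout)
print(mixed)
print("pixels to inpaint:", int(to_inpaint.sum()))
```

In this toy case the neck pixels and the part of the old head not covered by the new one end up in the transition mask, which corresponds to the region the abstract says is inpainted before the diffusion model refines the result.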

Code Implementations: 1 repo
