CV Dec 13, 2022

HS-Diffusion: Semantic-Mixing Diffusion for Head Swapping

Qinghe Wang, Lijie Liu, Miao Hua, Pengfei Zhu, Wangmeng Zuo, Qinghua Hu, Huchuan Lu, Bing Cao

arXiv:2212.06458v312.213 citations h-index: 105 Has Code

Originality: Incremental advance

AI Analysis

The paper addresses head swapping, a seldom-studied image editing problem, but the contribution appears incremental because it builds on existing diffusion models.

The paper tackles head swapping with HS-Diffusion, a semantic-mixing diffusion model that blends the head and body semantic layouts and inpaints the transition region. It reports high-quality reconstruction of the head and body and superior results under two new metrics, Mask-FID and Focal-FID.

Image-based head swapping aims to stitch a source head onto another source body flawlessly. This seldom-studied task faces two major challenges: 1) preserving the head and body from different sources while generating a seamless transition region, and 2) the lack of a paired head swapping dataset and benchmark. In this paper, we propose a semantic-mixing diffusion model for head swapping (HS-Diffusion), which consists of a latent diffusion model (LDM) and a semantic layout generator. We blend the semantic layouts of the source head and source body, and then inpaint the transition region with the semantic layout generator, achieving coarse-grained head swapping. The semantic-mixing LDM then performs fine-grained head swapping, conditioned on the inpainted layout, through a progressive fusion process, while preserving the head and body with high-quality reconstruction. To this end, we propose a semantic calibration strategy for natural inpainting and neck alignment for geometric realism. Importantly, we construct a new image-based head swapping benchmark and design two tailored metrics (Mask-FID and Focal-FID). Extensive experiments demonstrate the superiority of our framework. The code will be available at https://github.com/qinghew/HS-Diffusion.
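
The layout-blending step described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the label ids (`BACKGROUND`, `HEAD`, `BODY`, `TRANSITION`), the function name `blend_layouts`, and the rule for marking the transition region are all assumptions made for illustration. The sketch keeps head pixels from the head layout, keeps non-head pixels from the body layout, and marks the leftover area (where the body image's original head used to be) as a region that a layout generator would still need to inpaint.

```python
# Hypothetical label ids; the paper's actual label set is not specified here.
BACKGROUND, HEAD, BODY, TRANSITION = 0, 1, 2, 255


def blend_layouts(head_layout, body_layout, head_labels=frozenset({HEAD})):
    """Mix two equally sized semantic label maps (lists of rows of ints).

    Assumed rule, for illustration only:
      - a pixel labelled as head in head_layout keeps that head label;
      - otherwise, a pixel that was head in body_layout is marked TRANSITION,
        because the old head is gone and nothing has replaced it yet;
      - every other pixel keeps the label from body_layout.
    """
    if len(head_layout) != len(body_layout):
        raise ValueError("layouts must have the same number of rows")
    mixed = []
    for head_row, body_row in zip(head_layout, body_layout):
        if len(head_row) != len(body_row):
            raise ValueError("layouts must have the same number of columns")
        row = []
        for h, b in zip(head_row, body_row):
            if h in head_labels:
                row.append(h)           # preserve the source head
            elif b in head_labels:
                row.append(TRANSITION)  # left for the inpainting step
            else:
                row.append(b)           # preserve the source body / background
        mixed.append(row)
    return mixed


# Toy 3x4 layouts: the two heads occupy different columns.
head = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
body = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [2, 2, 2, 2],
]
mixed = blend_layouts(head, body)
for row in mixed:
    print(row)
```

In this toy case the first column of the top two rows is marked `TRANSITION`, since the body image's original head covered it but the new head does not. In the paper that region is filled by the semantic layout generator, which this sketch does not model.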
