CV · Jan 21

APPLE: Attribute-Preserving Pseudo-Labeling for Diffusion-Based Face Swapping

arXiv:2601.15288v1 · h-index: 8
Originality: Incremental advance
AI Analysis

This work addresses the preservation of target-specific attributes such as lighting and skin tone in face swapping, which matters for applications in media and entertainment. The contribution appears incremental, since it builds on existing diffusion-based methods.

The paper proposes APPLE, a diffusion-based teacher-student framework for face swapping that improves attribute fidelity through pseudo-label supervision. It reports state-of-the-art performance in attribute preservation and identity transfer, with more photorealistic results.

Face swapping aims to transfer the identity of a source face onto a target face while preserving target-specific attributes such as pose, expression, lighting, skin tone, and makeup. However, since real ground truth for face swapping is unavailable, achieving both accurate identity transfer and high-quality attribute preservation remains challenging. In addition, recent diffusion-based approaches attempt to improve visual fidelity through conditional inpainting on masked target images, but the masked condition removes crucial appearance cues of the target, resulting in plausible yet misaligned attributes. To address these limitations, we propose APPLE (Attribute-Preserving Pseudo-Labeling), a diffusion-based teacher-student framework that enhances attribute fidelity through attribute-aware pseudo-label supervision. We reformulate face swapping as a conditional deblurring task to more faithfully preserve target-specific attributes such as lighting, skin tone, and makeup. In addition, we introduce an attribute-aware inversion scheme to further improve detailed attribute preservation. Through an elaborate attribute-preserving design for teacher learning, APPLE produces high-quality pseudo triplets that explicitly provide the student with direct face-swapping supervision. Overall, APPLE achieves state-of-the-art performance in terms of attribute preservation and identity transfer, producing more photorealistic and target-faithful results.
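The contrast the abstract draws between a masked condition and a blurred condition can be illustrated with a small toy sketch. This is not the authors' implementation: the abstract does not specify the blur operator, the mask shape, or any parameters, so the box blur, the square face mask, the radius, and the function names `box_blur`, `masked_condition`, and `blurred_condition` are all assumptions made for illustration. It only shows that zeroing out the face region discards its average colour and brightness, while blurring the same region keeps those low-frequency cues roughly intact.

```python
import numpy as np

def box_blur(img: np.ndarray, radius: int) -> np.ndarray:
    """Separable box blur with edge padding; img is an HxWxC float array."""
    k = 2 * radius + 1
    kernel = np.ones(k) / k
    padded = np.pad(img, ((radius, radius), (radius, radius), (0, 0)), mode="edge")
    # Blur along the vertical axis, then along the horizontal axis.
    out = np.apply_along_axis(lambda v: np.convolve(v, kernel, mode="valid"), 0, padded)
    out = np.apply_along_axis(lambda v: np.convolve(v, kernel, mode="valid"), 1, out)
    return out

def masked_condition(target: np.ndarray, face_mask: np.ndarray) -> np.ndarray:
    """Inpainting-style condition: the face region is zeroed out (assumed form)."""
    return target * (1.0 - face_mask[..., None])

def blurred_condition(target: np.ndarray, face_mask: np.ndarray, radius: int = 3) -> np.ndarray:
    """Deblurring-style condition: the face region is replaced by a blurred copy (assumed form)."""
    m = face_mask[..., None]
    return target * (1.0 - m) + box_blur(target, radius) * m

# Toy target: grey background, a square "face" with a skin tone and a
# left-to-right lighting gradient. All values are made up for illustration.
H = W = 64
target = np.full((H, W, 3), 0.2)
skin_tone = np.array([0.8, 0.6, 0.5])
lighting = np.linspace(0.7, 1.0, 32)[None, :, None]   # darker on the left
target[16:48, 16:48] = skin_tone * lighting

face_mask = np.zeros((H, W))
face_mask[16:48, 16:48] = 1.0
inside = face_mask.astype(bool)

orig_mean = target[inside].mean(axis=0)
masked_mean = masked_condition(target, face_mask)[inside].mean(axis=0)
blurred_mean = blurred_condition(target, face_mask)[inside].mean(axis=0)

print("original face mean colour:", orig_mean)
print("masked condition mean    :", masked_mean)
print("blurred condition mean   :", blurred_mean)
```

In this toy setting the masked condition carries no colour information inside the face region, whereas the blurred condition stays close to the original mean colour, which is the kind of appearance cue (lighting, skin tone) the abstract says masking removes.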

