LG · Sep 15, 2025

From Autoencoders to CycleGAN: Robust Unpaired Face Manipulation via Adversarial Learning

arXiv:2509.12176v1
Originality: Incremental advance
AI Analysis

This work addresses the problem of generating realistic, identity-preserving face images from unpaired datasets for applications in entertainment and AI, representing an incremental improvement over existing methods.

The paper tackles unpaired face manipulation with a guided CycleGAN framework that uses spectral normalization and identity-preserving losses. It reports better realism (FID), perceptual quality (LPIPS), and identity preservation (ID-Sim) than autoencoder baselines.

Human face synthesis and manipulation are increasingly important in entertainment and AI, with a growing demand for highly realistic, identity-preserving images even when only unpaired, unaligned datasets are available. We study unpaired face manipulation via adversarial learning, moving from autoencoder baselines to a robust, guided CycleGAN framework. While autoencoders capture coarse identity, they often miss fine details. Our approach integrates spectral normalization for stable training, identity- and perceptual-guided losses to preserve subject identity and high-level structure, and landmark-weighted cycle constraints to maintain facial geometry across pose and illumination changes. Experiments show that our adversarially trained CycleGAN improves realism (FID), perceptual quality (LPIPS), and identity preservation (ID-Sim) over autoencoders, with competitive cycle-reconstruction SSIM and practical inference times. It achieves high quality without paired datasets and approaches pix2pix on curated paired subsets. These results demonstrate that guided, spectrally normalized CycleGANs provide a practical path from autoencoders to robust unpaired face manipulation.
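
The abstract names spectral normalization as its training stabilizer but gives no implementation details. As a rough illustration of the general technique (not the authors' code), the sketch below estimates a weight matrix's largest singular value by power iteration and divides the matrix by it, so the rescaled matrix has spectral norm close to 1. The function names, the list-of-rows matrix format, the iteration count, and the fixed seed are all assumptions made for this example; a real training pipeline would more likely rely on a deep learning framework's built-in layer wrapper.

```python
import math
import random


def matvec(W, v):
    """Multiply matrix W (a list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def matvec_t(W, u):
    """Multiply the transpose of W by vector u."""
    n_cols = len(W[0])
    return [sum(W[i][j] * u[i] for i in range(len(W))) for j in range(n_cols)]


def normalize(v, eps=1e-12):
    """Scale v to (approximately) unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / (norm + eps) for x in v]


def spectral_norm(W, n_iters=50, seed=0):
    """Estimate the largest singular value of W by power iteration."""
    rng = random.Random(seed)  # hypothetical fixed seed, for repeatability
    u = normalize([rng.gauss(0.0, 1.0) for _ in range(len(W))])
    v = normalize(matvec_t(W, u))
    for _ in range(n_iters):
        v = normalize(matvec_t(W, u))  # right singular vector estimate
        u = normalize(matvec(W, v))    # left singular vector estimate
    # sigma is approximately u^T W v once u and v have converged
    return sum(a * b for a, b in zip(u, matvec(W, v)))


def spectrally_normalize(W, n_iters=50):
    """Return W divided by its estimated largest singular value."""
    sigma = spectral_norm(W, n_iters)
    return [[w / sigma for w in row] for row in W]
```

For example, calling `spectrally_normalize` on a discriminator layer's weight matrix before each forward pass bounds how much that layer can stretch its input, which is the usual motivation for applying the technique to GAN discriminators.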

