Pluralistic Aging Diffusion Autoencoder
This work addresses the need for realistic and varied face aging simulations in applications such as entertainment and forensics, though it is an incremental improvement over existing diffusion-based methods.
The paper tackles the ill-posed problem of face aging by proposing a method to generate multiple plausible aging patterns instead of a single deterministic output, achieving more diverse and high-quality results as demonstrated in experiments.
Face aging is an ill-posed problem because multiple plausible aging patterns may correspond to a given input. Most existing methods produce a single deterministic estimate. This paper proposes a novel CLIP-driven Pluralistic Aging Diffusion Autoencoder (PADA) to enhance the diversity of aging patterns. First, we employ diffusion models to generate diverse low-level aging details via a sequential denoising reverse process. Second, we present a Probabilistic Aging Embedding (PAE) to capture diverse high-level aging patterns, which represents age information as probabilistic distributions in the common CLIP latent space. A text-guided KL-divergence loss is designed to guide this learning. Our method can achieve pluralistic face aging conditioned on open-world aging texts and arbitrary unseen face images. Qualitative and quantitative experiments demonstrate that our method can generate more diverse and high-quality plausible aging results.
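To make the idea of a probabilistic embedding with a KL-divergence loss more concrete, the following is a minimal sketch and not the paper's implementation. It assumes, purely for illustration, that each embedding is a diagonal Gaussian with a mean and standard deviation per dimension; the abstract does not state the distribution family, and the function names `kl_diag_gaussians` and `sample_embedding` are hypothetical.

```python
import math
import random


def kl_diag_gaussians(mu_p, sigma_p, mu_q, sigma_q):
    """KL(p || q) for two diagonal Gaussians given as per-dimension lists.

    Assumption: embeddings are modeled as diagonal Gaussians. The source
    text only says "probabilistic distributions".
    """
    total = 0.0
    for mp, sp, mq, sq in zip(mu_p, sigma_p, mu_q, sigma_q):
        # Closed-form KL between two univariate Gaussians, summed over dims.
        total += math.log(sq / sp) + (sp ** 2 + (mp - mq) ** 2) / (2 * sq ** 2) - 0.5
    return total


def sample_embedding(mu, sigma, rng):
    """Draw one embedding via the reparameterization z = mu + sigma * eps."""
    return [m + s * rng.gauss(0.0, 1.0) for m, s in zip(mu, sigma)]


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical 2-D "image-side" and "text-side" age distributions.
    mu_img, sigma_img = [0.2, -0.1], [0.5, 0.8]
    mu_txt, sigma_txt = [0.0, 0.0], [1.0, 1.0]
    print("KL:", kl_diag_gaussians(mu_img, sigma_img, mu_txt, sigma_txt))
    # Different samples from the same distribution give different embeddings,
    # which is the source of diversity in this toy setting.
    print(sample_embedding(mu_img, sigma_img, rng))
    print(sample_embedding(mu_img, sigma_img, rng))
```

In this toy setting, the KL term is zero when the two distributions coincide and grows as they move apart, while repeated sampling from one distribution yields distinct embeddings. How the paper's text-guided loss pairs image and text distributions is not specified here.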