CV · Oct 28, 2025

Beyond Inference Intervention: Identity-Decoupled Diffusion for Face Anonymization

Haoxin Yang, Yihong Lin, Jingdan Kang, Xuemiao Xu, Yue Li, Cheng Xu, Shengfeng He

arXiv:2510.24213v1 · 1 citation · h-index: 8

Originality: Highly original

AI Analysis

This work addresses privacy preservation in facial data for applications such as surveillance and social media. Rather than refining inference-time techniques incrementally, it targets the distribution shifts and attribute entanglement that those techniques introduce.

The paper tackles face anonymization by proposing ID^2Face, a training-centric framework that learns a disentangled latent space separating identity from non-identity attributes, enabling direct anonymization without inference-time interventions. The authors report that it outperforms existing methods in visual quality, identity suppression, and utility preservation.

Face anonymization aims to conceal identity information while preserving non-identity attributes. Mainstream diffusion models rely on inference-time interventions such as negative guidance or energy-based optimization, which are applied post-training to suppress identity features. These interventions often introduce distribution shifts and entangle identity with non-identity attributes, degrading visual fidelity and data utility. To address this, we propose ID^2Face, a training-centric anonymization framework that removes the need for inference-time optimization. The rationale of our method is to learn a structured latent space where identity and non-identity information are explicitly disentangled, enabling direct and controllable anonymization at inference. To this end, we design a conditional diffusion model with an identity-masked learning scheme. An Identity-Decoupled Latent Recomposer uses an Identity Variational Autoencoder to model identity features, while non-identity attributes are extracted from same-identity pairs and aligned through bidirectional latent alignment. An Identity-Guided Latent Harmonizer then fuses these representations via soft-gating conditioned on noisy feature prediction. The model is trained with a recomposition-based reconstruction loss to enforce disentanglement. At inference, anonymization is achieved by sampling a random identity vector from the learned identity space. To further suppress identity leakage, we introduce an Orthogonal Identity Mapping strategy that enforces orthogonality between sampled and source identity vectors. Experiments demonstrate that ID^2Face outperforms existing methods in visual quality, identity suppression, and utility preservation.
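
The abstract does not say how the Orthogonal Identity Mapping enforces orthogonality, so the following is only a minimal sketch of one plausible reading: sample a random identity vector, then remove its component along the source identity with a single Gram-Schmidt projection step. The function name `sample_orthogonal_identity`, the 512-dimensional vectors, and the standard-normal prior are all assumptions made for illustration and are not taken from the paper.

```python
import math
import random


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(dot(v, v))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [a / norm for a in v]


def sample_orthogonal_identity(source_id, rng):
    """Hypothetical helper: draw a random identity vector and project out
    its component along the source identity (one Gram-Schmidt step)."""
    src = normalize(source_id)
    # Assumption: the learned identity space is approximated by a
    # standard-normal prior, as is common for VAE latents.
    z = [rng.gauss(0.0, 1.0) for _ in src]
    proj = dot(z, src)
    z_orth = [a - proj * b for a, b in zip(z, src)]
    return normalize(z_orth)


rng = random.Random(0)
# Stand-in for a source identity embedding (dimension is an assumption).
source_id = [rng.gauss(0.0, 1.0) for _ in range(512)]
anon_id = sample_orthogonal_identity(source_id, rng)
cosine = dot(anon_id, normalize(source_id))
print(f"cosine similarity to source: {cosine:.2e}")
```

In this sketch the sampled vector has cosine similarity to the source that is zero up to floating-point error, which is the property the abstract attributes to the strategy. The actual method may instead use a learned mapping or a training-time constraint.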
