LGSep 24, 2025

Generative Model Inversion Through the Lens of the Manifold Hypothesis

Xiong Peng, Bo Han, Fengfei Yu, Tongliang Liu, Feng Liu, Mingyuan Zhou

arXiv:2509.20177v1h-index: 19

Originality Highly original

AI Analysis

This addresses security vulnerabilities in machine learning models for privacy-sensitive applications, offering both defensive insights and enhanced attack methods.

The paper investigates why generative model inversion attacks are effective, finding that they work by projecting noisy gradients onto the generator manifold's tangent space. It validates that models are more vulnerable when loss gradients align with this manifold, and introduces both a training objective and a training-free method that improve state-of-the-art attacks.

Model inversion attacks (MIAs) aim to reconstruct class-representative samples from trained models. Recent generative MIAs utilize generative adversarial networks to learn image priors that guide the inversion process, yielding reconstructions with high visual quality and strong fidelity to the private training data. To explore the reason behind their effectiveness, we begin by examining the gradients of inversion loss with respect to synthetic inputs, and find that these gradients are surprisingly noisy. Further analysis reveals that generative inversion implicitly denoises these gradients by projecting them onto the tangent space of the generator manifold, filtering out off-manifold components while preserving informative directions aligned with the manifold. Our empirical measurements show that, in models trained with standard supervision, loss gradients often exhibit large angular deviations from the data manifold, indicating poor alignment with class-relevant directions. This observation motivates our central hypothesis: models become more vulnerable to MIAs when their loss gradients align more closely with the generator manifold. We validate this hypothesis by designing a novel training objective that explicitly promotes such alignment. Building on this insight, we further introduce a training-free approach to enhance gradient-manifold alignment during inversion, leading to consistent improvements over state-of-the-art generative MIAs.

View on arXiv PDF

Similar