LG · May 14, 2025

Self-Consuming Generative Models with Adversarially Curated Data

arXiv:2505.09768v1 · 15.79 citations · h-index: 2 · ICML

Originality: Incremental advance

AI Analysis

This work addresses the stability and security of generative models in real-world settings where data curation is imperfect or malicious. It is an incremental but important extension of prior work on self-consuming loops.

The paper tackles the problem of generative models evolving under self-consuming retraining loops with noisy and adversarially curated data, showing that such curation can disrupt models and designing effective attack algorithms for competitive scenarios.

Recent advances in generative models have made it increasingly difficult to distinguish real data from model-generated synthetic data. Using synthetic data for successive training of future model generations creates "self-consuming loops", which may lead to model collapse or training instability. Furthermore, synthetic data is often subject to human feedback and curated by users based on their preferences. Ferbach et al. (2024) recently showed that when data is curated according to user preferences, the self-consuming retraining loop drives the model to converge toward a distribution that optimizes those preferences. However, in practice, data curation is often noisy or adversarially manipulated. For example, competing platforms may recruit malicious users to adversarially curate data and disrupt rival models. In this paper, we study how generative models evolve under self-consuming retraining loops with noisy and adversarially curated data. We theoretically analyze the impact of such noisy data curation on generative models and identify conditions for the robustness of the retraining process. Building on this analysis, we design attack algorithms for competitive adversarial scenarios, where a platform with a limited budget employs malicious users to misalign a rival's model from actual user preferences. Experiments on both synthetic and real-world datasets demonstrate the effectiveness of the proposed algorithms.
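The dynamics described in the abstract can be illustrated with a toy simulation. This is a minimal sketch under stated assumptions, not the paper's model or attack algorithm: the generative model is a categorical distribution over three outcomes, curation is a pairwise Bradley-Terry-style choice driven by a reward, retraining is idealized as an exact fit to the curated distribution, and an adversarial curator is modeled as one who picks according to the negated reward. The names `retrain_step`, `run_loop`, and `flip_rate`, and the reward values, are hypothetical.

```python
import math


def sigmoid(z):
    """Logistic function, used as the pairwise choice probability."""
    return 1.0 / (1.0 + math.exp(-z))


def retrain_step(p, reward, flip_rate):
    """One idealized retraining step on pairwise-curated samples.

    Two samples are drawn from the current model p. An honest curator keeps
    sample i over sample j with probability sigmoid(reward[i] - reward[j]).
    With probability flip_rate the curator is adversarial and uses the
    negated reward instead (an assumption made for this sketch).
    """
    n = len(p)
    new_p = []
    for i in range(n):
        win = 0.0
        for j in range(n):
            honest = sigmoid(reward[i] - reward[j])
            mixed = (1.0 - flip_rate) * honest + flip_rate * (1.0 - honest)
            win += p[j] * mixed
        # The factor 2 counts the two positions the kept sample could occupy.
        new_p.append(2.0 * p[i] * win)
    total = sum(new_p)  # equals 1 up to rounding; renormalize for safety
    return [v / total for v in new_p]


def run_loop(reward, flip_rate, steps=200):
    """Iterate the self-consuming loop from a uniform initial model."""
    p = [1.0 / len(reward)] * len(reward)
    for _ in range(steps):
        p = retrain_step(p, reward, flip_rate)
    return p


reward = [0.0, 1.0, 2.0]  # hypothetical user preferences over three outcomes
clean = run_loop(reward, flip_rate=0.0)
noisy = run_loop(reward, flip_rate=0.2)
attacked = run_loop(reward, flip_rate=0.8)

for name, dist in [("clean", clean), ("noisy", noisy), ("attacked", attacked)]:
    print(name, [round(v, 4) for v in dist])
```

In this toy setting, the loop still concentrates on the highest-reward outcome when only a minority of curation decisions are flipped, and it drifts toward the lowest-reward outcome when flipped decisions dominate. The paper's analysis concerns general conditions and budget-limited attacks, which this sketch does not attempt to reproduce.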

