CV | Sep 21, 2025

PRISM: Precision-Recall Informed Data-Free Knowledge Distillation via Generative Diffusion

arXiv:2509.16897v1 | 1 citation | h-index: 3
Originality: Incremental advance
AI Analysis

This work addresses the challenge of knowledge transfer without real data for large-scale image tasks, offering an incremental improvement over prior DFKD methods.

The paper tackles data-free knowledge distillation (DFKD) for large-scale images, where existing methods suffer from mode collapse. It proposes PRISM, which addresses the precision-recall challenges of synthetic data generation. The authors report superior performance on several large-scale image datasets and strong domain generalization.

Data-free knowledge distillation (DFKD) transfers knowledge from a teacher to a student without access to the real in-distribution (ID) data. While existing methods perform well on small-scale images, they suffer from mode collapse when synthesizing large-scale images, which limits knowledge transfer. Recently, leveraging advanced generative models to synthesize photorealistic images has emerged as a promising alternative. Nevertheless, directly using an off-the-shelf diffusion model to generate datasets faces two precision-recall challenges: 1) precision, ensuring that synthetic data aligns with the real distribution, and 2) recall, ensuring coverage of the real ID manifold. In response, we propose PRISM, a precision-recall informed synthesis method. Specifically, we introduce Energy-guided Distribution Alignment to avoid generating out-of-distribution samples, and design Diversified Prompt Engineering to enhance coverage of the real ID manifold. Extensive experiments on various large-scale image datasets demonstrate the superiority of PRISM. Moreover, we demonstrate that models trained with PRISM exhibit strong domain generalization.
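The abstract names an energy-based alignment signal and a teacher-to-student transfer objective but gives no formulas. As a rough illustration only, the sketch below assumes the widely used free-energy score from energy-based out-of-distribution detection, E(x) = -T * logsumexp(f(x) / T), and the standard temperature-softened KL distillation loss. The function names and default temperatures are hypothetical, and PRISM's actual guidance and training objective may differ.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def energy_score(logits, temperature=1.0):
    """Free energy -T * logsumexp(logits / T) of teacher logits.

    Assumption: lower energy is read as "more in-distribution",
    as in energy-based OOD detection. Not taken from the paper.
    """
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    log_sum_exp = m + math.log(sum(math.exp(s - m) for s in scaled))
    return -temperature * log_sum_exp


def distillation_loss(teacher_logits, student_logits, temperature=4.0):
    """KL(teacher || student) on softened outputs, scaled by T^2."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return temperature ** 2 * kl


# A confident teacher prediction has lower energy than a flat one.
confident = energy_score([10.0, 0.0, 0.0])
flat = energy_score([0.0, 0.0, 0.0])
print(confident < flat)
```

Under these assumptions, a synthetic image whose teacher logits yield high energy would be treated as likely out-of-distribution and steered away from or filtered out, while the distillation loss is what the student would minimize on the retained images.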

