CV · Sep 21, 2025

PRISM: Precision-Recall Informed Data-Free Knowledge Distillation via Generative Diffusion

Xuewan He, Jielei Wang, Zihan Cheng, Yuchen Su, Shiyue Huang, Guoming Lu

arXiv:2509.16897v1 · 3.61 citations · h-index: 3

Originality: Incremental advance

AI Analysis

This work addresses the challenge of knowledge transfer without real data for large-scale image tasks, offering an incremental improvement over prior DFKD methods.

The paper addresses data-free knowledge distillation (DFKD) for large-scale images, where existing methods suffer from mode collapse. It proposes PRISM to handle the precision-recall challenges of synthetic data generation, and reports superior performance on several datasets along with strong domain generalization.

Data-free knowledge distillation (DFKD) transfers knowledge from a teacher to a student without access to the real in-distribution (ID) data. While existing methods perform well on small-scale images, they suffer from mode collapse when synthesizing large-scale images, resulting in limited knowledge transfer. Recently, leveraging advanced generative models to synthesize photorealistic images has emerged as a promising alternative. Nevertheless, directly using off-the-shelf diffusion models to generate datasets faces two precision-recall challenges: 1) ensuring that the synthetic data aligns with the real distribution (precision), and 2) ensuring coverage of the real ID manifold (recall). In response, we propose PRISM, a precision-recall informed synthesis method. Specifically, we introduce Energy-guided Distribution Alignment to avoid generating out-of-distribution samples, and design Diversified Prompt Engineering to enhance coverage of the real ID manifold. Extensive experiments on various large-scale image datasets demonstrate the superiority of PRISM. Moreover, we demonstrate that models trained with PRISM exhibit strong domain generalization.
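The abstract does not say how the energy in Energy-guided Distribution Alignment is defined or applied. As a rough illustration only, the sketch below assumes the energy score commonly used in out-of-distribution detection, E(x) = -T · logsumexp(f(x) / T) over the teacher's logits f(x), and uses it as a simple post-hoc filter on synthetic samples. The function names, the threshold, and the logit values are all hypothetical, and PRISM may instead apply the energy as guidance during diffusion sampling.

```python
import math

def energy_score(logits, temperature=1.0):
    """Energy E(x) = -T * logsumexp(logits / T).

    Assumed form (standard in OOD detection, not confirmed for PRISM).
    Lower energy suggests the sample looks more in-distribution to the teacher.
    """
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    lse = m + math.log(sum(math.exp(s - m) for s in scaled))
    return -temperature * lse

def filter_by_energy(batch_logits, threshold):
    """Return indices of samples whose energy is at or below the threshold."""
    return [i for i, logits in enumerate(batch_logits)
            if energy_score(logits) <= threshold]

# Hypothetical teacher logits for three synthetic images (made-up numbers).
teacher_logits = [
    [9.0, 0.5, 0.2],  # one dominant logit: low energy
    [0.1, 0.2, 0.1],  # flat, small logits: higher energy
    [6.0, 5.5, 0.0],
]

# The threshold is an arbitrary illustrative value, not taken from the paper.
kept = filter_by_energy(teacher_logits, threshold=-5.0)
print(kept)
```

A filter like this only speaks to the precision side of the trade-off. Coverage of the ID manifold (recall) is what the abstract attributes to Diversified Prompt Engineering, which this sketch does not attempt to model.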

