LG, AI, CV | Nov 13, 2025

PRISM: Diversifying Dataset Distillation by Decoupling Architectural Priors

arXiv:2511.09905v1 | h-index: 13 | Trans. Mach. Learn. Res.
Originality: Incremental advance
AI Analysis

This work addresses a practical bottleneck in dataset distillation: synthetic data that lacks diversity and generalizes poorly. It is an incremental advance, since it builds on existing multi-teacher approaches.

The paper tackles a known weakness of dataset distillation: synthetic data inherits the inductive bias of a single teacher model, which reduces intra-class diversity and hurts generalization. It presents PRISM, a framework that decouples architectural priors by drawing on diverse teacher models. PRISM outperforms existing methods on ImageNet-1K and generates data with richer intra-class diversity.

Dataset distillation (DD) promises compact yet faithful synthetic data, but existing approaches often inherit the inductive bias of a single teacher model. As dataset size increases, this bias drives generation toward overly smooth, homogeneous samples, reducing intra-class diversity and limiting generalization. We present PRISM (PRIors from diverse Source Models), a framework that disentangles architectural priors during synthesis. PRISM decouples the logit-matching and regularization objectives, supervising them with different teacher architectures: a primary model for logits and a stochastic subset for batch-normalization (BN) alignment. On ImageNet-1K, PRISM consistently and reproducibly outperforms single-teacher methods (e.g., SRe2L) and recent multi-teacher variants (e.g., G-VBSM) at low- and mid-IPC regimes. The generated data also show significantly richer intra-class diversity, as reflected by a notable drop in cosine similarity between features. We further analyze teacher selection strategies (pre- vs. intra-distillation) and introduce a scalable cross-class batch formation scheme for fast parallel synthesis. Code will be released after the review period.
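The decoupling described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the function names (`cross_entropy`, `bn_alignment`, `decoupled_loss`), the weight `alpha`, the subset size `k`, the dictionary layout for teachers, and the choice to average the BN term over the sampled subset are all assumptions made for illustration. The BN term uses the L2 distance between batch statistics and stored running statistics, a form common in BN-statistics matching; the abstract does not confirm the exact formula. Plain Python lists stand in for real network activations.

```python
import math
import random

def cross_entropy(logits, label):
    """Numerically stable softmax cross-entropy for one sample."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(v - m) for v in logits))
    return log_z - logits[label]

def bn_alignment(features, running_mean, running_var):
    """L2 gap between batch statistics and one teacher layer's stored BN statistics.

    features: list of per-sample vectors, one value per channel (assumed layout).
    """
    n = len(features)
    channels = range(len(running_mean))
    mean = [sum(f[c] for f in features) / n for c in channels]
    var = [sum((f[c] - mean[c]) ** 2 for f in features) / n for c in channels]
    d_mean = math.sqrt(sum((mean[c] - running_mean[c]) ** 2 for c in channels))
    d_var = math.sqrt(sum((var[c] - running_var[c]) ** 2 for c in channels))
    return d_mean + d_var

def decoupled_loss(primary_logits, labels, bn_teachers, k, alpha, rng):
    """Logit term from the primary teacher plus a BN term from a random teacher subset.

    bn_teachers: list of dicts with keys "features", "running_mean", "running_var"
    (a hypothetical one-layer view of each teacher on the synthetic batch).
    """
    # Logit matching is supervised by the primary teacher only.
    logit_term = sum(
        cross_entropy(l, y) for l, y in zip(primary_logits, labels)
    ) / len(labels)
    # BN alignment is supervised by a stochastic subset of k teachers.
    subset = rng.sample(bn_teachers, k)
    bn_term = sum(
        bn_alignment(t["features"], t["running_mean"], t["running_var"])
        for t in subset
    ) / k
    return logit_term + alpha * bn_term

# Toy usage: two synthetic samples, two classes, three hypothetical teachers.
rng = random.Random(0)
batch = [[1.0, 2.0], [3.0, 6.0]]
teachers = [
    {"features": batch, "running_mean": [2.0, 4.0], "running_var": [1.0, 4.0]},
    {"features": batch, "running_mean": [0.0, 0.0], "running_var": [1.0, 1.0]},
    {"features": batch, "running_mean": [2.5, 3.5], "running_var": [0.5, 2.0]},
]
loss = decoupled_loss([[0.2, -0.1], [0.0, 0.4]], [0, 1], teachers, k=2, alpha=0.1, rng=rng)
```

In a real pipeline the synthetic images would be optimized by gradient descent on this loss, and a fresh teacher subset would be drawn at each step, so no single architecture's BN statistics dominate the regularization.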

