CV | LG | Mar 3, 2025

Understanding Dataset Distillation via Spectral Filtering

arXiv:2503.01212v1 | 3 citations | h-index: 12
Originality: Incremental advance
AI Analysis

This work provides a framework for dataset distillation that unifies diverse methods and improves compression efficiency, although its advance over existing techniques is incremental.

The paper addresses the lack of a unified understanding of dataset distillation by proposing UniDD, a spectral filtering framework that interprets each method as a filter function acting on feature correlations. It also introduces Curriculum Frequency Matching (CFM), which adjusts the filter parameter over the course of distillation and outperforms existing baselines on CIFAR-10/100 and ImageNet-1K.

Dataset distillation (DD) has emerged as a promising approach to compress datasets and speed up model training. However, the underlying connections among various DD methods remain largely unexplored. In this paper, we introduce UniDD, a spectral filtering framework that unifies diverse DD objectives. UniDD interprets each DD objective as a specific filter function that affects the eigenvalues of the feature-feature correlation (FFC) matrix and modulates the frequency components of the feature-label correlation (FLC) matrix. In this way, UniDD reveals that the essence of DD fundamentally lies in matching frequency-specific features. Moreover, according to the filter behaviors, we classify existing methods into low-frequency matching and high-frequency matching, encoding global texture and local details, respectively. However, existing methods rely on fixed filter functions throughout distillation, which cannot capture the low- and high-frequency information simultaneously. To address this limitation, we further propose Curriculum Frequency Matching (CFM), which gradually adjusts the filter parameter to cover both low- and high-frequency information of the FFC and FLC matrices. Extensive experiments on small-scale datasets, such as CIFAR-10/100, and large-scale datasets, including ImageNet-1K, demonstrate the superior performance of CFM over existing baselines and validate the practicality of UniDD.
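The spectral view in the abstract can be made concrete with a small NumPy sketch. It is an illustration under stated assumptions, not the authors' implementation: the function names (`filtered_correlation`, `ridge_filter`, `matching_loss`, `curriculum_betas`), the ridge-style filter g(λ) = 1/(λ + β), and the geometric schedule for β are all hypothetical choices made here. The abstract only states that a filter function acts on the eigenvalues of the FFC matrix, modulates the FLC matrix, and that CFM gradually adjusts a filter parameter.

```python
import numpy as np


def filtered_correlation(features, labels, filter_fn):
    """Return U g(Lambda) U^T (X^T Y) for features X and labels Y.

    X^T X plays the role of the feature-feature correlation (FFC) matrix
    and X^T Y the feature-label correlation (FLC) matrix.
    """
    ffc = features.T @ features              # (d, d), symmetric PSD
    flc = features.T @ labels                # (d, c)
    eigvals, eigvecs = np.linalg.eigh(ffc)   # eigenvalues in ascending order
    eigvals = np.clip(eigvals, 0.0, None)    # remove tiny negative round-off
    gains = filter_fn(eigvals)               # filter response per eigenvalue
    # Re-weight each spectral component of the FLC matrix by its gain.
    return eigvecs @ (gains[:, None] * (eigvecs.T @ flc))


def ridge_filter(beta):
    """Assumed example filter g(lambda) = 1 / (lambda + beta)."""
    def g(eigvals):
        return 1.0 / (eigvals + beta)
    return g


def matching_loss(real_x, real_y, syn_x, syn_y, filter_fn):
    """Squared Frobenius gap between filtered real and synthetic statistics.

    Applying the filter to each set's own spectrum is an assumption of
    this sketch.
    """
    real = filtered_correlation(real_x, real_y, filter_fn)
    syn = filtered_correlation(syn_x, syn_y, filter_fn)
    return float(np.sum((real - syn) ** 2))


def curriculum_betas(start, end, steps):
    """Assumed schedule: sweep the filter parameter geometrically."""
    return np.geomspace(start, end, steps)
```

With `ridge_filter(beta)`, `filtered_correlation` reduces to the ridge-regression solution (XᵀX + βI)⁻¹XᵀY, which makes it easy to check against `np.linalg.solve`. A curriculum in this sketch would evaluate `matching_loss` with one value from `curriculum_betas` per distillation stage, so that the emphasis placed on small versus large eigenvalues shifts over time. Which direction the sweep should run is not specified in the abstract and is left open here.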

