CV LG Dec 15, 2025

DiRe: Diversity-promoting Regularization for Dataset Condensation

Saumyaranjan Mohanty, Aravind Reddy, Konda Reddy Mopuri

arXiv:2512.13083v2 3.6

Originality: Incremental advance

AI Analysis

The paper addresses the need for more efficient and diverse condensed datasets in machine learning. The advance is incremental because it builds on existing condensation methods.

The paper tackles redundancy in the synthesized datasets produced by dataset condensation. It proposes a diversity-promoting regularizer (DiRe) that, when added to state-of-the-art condensation methods, improves their generalization and diversity metrics on benchmarks ranging from CIFAR-10 to ImageNet-1K.

In Dataset Condensation, the goal is to synthesize a small dataset that replicates the training utility of a large original dataset. Existing condensation methods synthesize datasets with significant redundancy, so there is a dire need to reduce redundancy and improve the diversity of the synthesized datasets. To tackle this, we propose an intuitive Diversity Regularizer (DiRe) composed of cosine similarity and Euclidean distance, which can be applied off-the-shelf to various state-of-the-art condensation methods. Through extensive experiments, we demonstrate that the addition of our regularizer improves state-of-the-art condensation methods on various benchmark datasets from CIFAR-10 to ImageNet-1K with respect to generalization and diversity metrics.
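The abstract says only that the regularizer is composed of cosine similarity and Euclidean distance; it does not give the formula. The sketch below is one plausible reading, not the authors' definition: it penalizes high average pairwise cosine similarity and rewards large average pairwise Euclidean distance among the synthetic samples of one class. The function names, the weights `cos_weight` and `dist_weight`, and the choice to average over pairs are all assumptions made for illustration.

```python
import math
from itertools import combinations


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def euclidean_distance(u, v):
    """Straight-line distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def diversity_penalty(features, cos_weight=1.0, dist_weight=1.0):
    """Hypothetical diversity term: lower values mean a more diverse set.

    `features` is a list of feature vectors for the synthetic samples of one
    class. The weights are illustrative assumptions, not values from the paper.
    """
    pairs = list(combinations(features, 2))
    if not pairs:
        return 0.0  # fewer than two samples: nothing to compare
    mean_cos = sum(cosine_similarity(u, v) for u, v in pairs) / len(pairs)
    mean_dist = sum(euclidean_distance(u, v) for u, v in pairs) / len(pairs)
    # Similar directions raise the penalty; spread-out samples lower it.
    return cos_weight * mean_cos - dist_weight * mean_dist


# Three identical samples (fully redundant) vs. three spread-out samples.
redundant = [[1.0, 0.0], [1.0, 0.0], [1.0, 0.0]]
diverse = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
print(diversity_penalty(redundant) > diversity_penalty(diverse))
```

In an actual condensation pipeline, a term of this kind would presumably be added to the method's existing loss and computed with a differentiable tensor library, so that gradients flow back to the synthetic images. The pure-Python version here only illustrates how the two quantities combine.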
