LG, May 3, 2023

A Survey on Dataset Distillation: Approaches, Applications and Future Directions

arXiv:2305.01975v3, 46 citations
Originality: Synthesis-oriented
AI Analysis

The survey gives researchers and practitioners in machine learning a comprehensive overview of dataset distillation, but its contribution is incremental: it synthesizes existing work without introducing new methods.

This survey addresses the lack of a holistic understanding of dataset distillation by proposing a taxonomy and systematically reviewing approaches, data modalities, and applications, such as continual learning and privacy protection.

Dataset distillation is attracting more attention in machine learning as training sets continue to grow and the cost of training state-of-the-art models becomes increasingly high. By synthesizing datasets with high information density, dataset distillation offers a range of potential applications, including support for continual learning, neural architecture search, and privacy protection. Despite recent advances, we lack a holistic understanding of the approaches and applications. Our survey aims to bridge this gap by first proposing a taxonomy of dataset distillation, characterizing existing approaches, and then systematically reviewing the data modalities and related applications. In addition, we summarize the challenges and discuss future directions for this field of research.

Code Implementations: 1 repo
