LG · AI · NE · ML · Mar 16, 2022

Learning to Generate Synthetic Training Data using Gradient Matching and Implicit Differentiation

arXiv:2203.08559v1 · 12 citations · h-index: 6
Originality: Incremental advance
AI Analysis

This work addresses the cost and inconvenience of large datasets for deep learning practitioners, but it is incremental as it builds on existing data distillation ideas.

The paper tackles the problem of reducing the need for large training datasets by proposing new data distillation techniques based on gradient matching and implicit differentiation, showing that these methods are computationally more efficient and improve model performance on distilled MNIST data.

Using huge training datasets can be costly and inconvenient. This article explores various data distillation techniques that can reduce the amount of data required to successfully train deep networks. Inspired by recent ideas, we suggest new data distillation techniques based on generative teaching networks, gradient matching, and the Implicit Function Theorem. Experiments on the MNIST image classification problem show that the new methods are computationally more efficient than previous ones and improve the performance of models trained on distilled data.
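As a rough illustration of the gradient-matching idea named in the abstract, the sketch below compares the training gradient produced by a small synthetic batch with the gradient produced by a real batch. It is not the paper's implementation. The binary logistic-regression model, the toy 2D data, the choice of cosine distance as the matching metric, and all function names are assumptions made for this example only.

```python
import math

def logistic_grad(w, xs, ys):
    """Mean gradient of binary cross-entropy w.r.t. the weights of a linear model."""
    grad = [0.0] * len(w)
    for x, y in zip(xs, ys):
        z = sum(wi * xi for wi, xi in zip(w, x))
        p = 1.0 / (1.0 + math.exp(-z))  # predicted probability of class 1
        for j, xj in enumerate(x):
            grad[j] += (p - y) * xj / len(xs)
    return grad

def cosine_distance(a, b):
    """1 - cosine similarity; near 0 when the two gradients point the same way."""
    dot = sum(ai * bi for ai, bi in zip(a, b))
    norm_a = math.sqrt(sum(ai * ai for ai in a))
    norm_b = math.sqrt(sum(bi * bi for bi in b))
    return 1.0 - dot / (norm_a * norm_b + 1e-12)

def matching_loss(w, real_x, real_y, syn_x, syn_y):
    """Hypothetical matching objective: distance between real and synthetic gradients."""
    return cosine_distance(logistic_grad(w, real_x, real_y),
                           logistic_grad(w, syn_x, syn_y))

# Toy data (assumed for illustration): four real points, two synthetic points.
w = [0.1, -0.2]
real_x = [[1.0, 2.0], [2.0, 1.0], [-1.0, -2.0], [-2.0, -1.0]]
real_y = [1, 1, 0, 0]
syn_x = [[1.5, 1.5], [-1.5, -1.5]]

loss_good = matching_loss(w, real_x, real_y, syn_x, [1, 0])  # labels agree with real data
loss_bad = matching_loss(w, real_x, real_y, syn_x, [0, 1])   # labels flipped

print(loss_good < loss_bad)
```

In a distillation setting the synthetic inputs would be updated to reduce a loss of this kind, typically across many model states. The sketch only evaluates the loss and does not perform that optimization.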

Code Implementations: 1 repo