CV | Aug 29, 2024

Neural Spectral Decomposition for Dataset Distillation

arXiv:2408.16236v1 | 16 citations | h-index: 11 | Has Code
Originality: Incremental advance
AI Analysis

The paper addresses the problem of efficiently compressing datasets for machine learning practitioners. It appears incremental: it builds on existing distillation methods and contributes a new decomposition approach.

The paper tackles dataset distillation by proposing Neural Spectrum Decomposition, a framework that treats datasets as low-rank observations and reconstructs data distributions using spectrum tensors and transformation matrices, achieving state-of-the-art performance on benchmarks like CIFAR10, CIFAR100, Tiny Imagenet, and ImageNet Subset.

In this paper, we propose Neural Spectrum Decomposition, a generic decomposition framework for dataset distillation. Unlike previous methods, we consider the entire dataset as a high-dimensional observation that is low-rank across all dimensions. We aim to discover the low-rank representation of the entire dataset and perform distillation efficiently. Toward this end, we learn a set of spectrum tensors and transformation matrices, which, through simple matrix multiplication, reconstruct the data distribution. Specifically, a spectrum tensor can be mapped back to the image space by a transformation matrix, and efficient information sharing during the distillation learning process is achieved through pairwise combinations of different spectrum vectors and transformation matrices. Furthermore, we integrate a trajectory matching optimization method guided by a real distribution. Our experimental results demonstrate that our approach achieves state-of-the-art performance on benchmarks, including CIFAR10, CIFAR100, Tiny Imagenet, and ImageNet Subset. Our code is available at https://github.com/slyang2021/NSD.
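
The reconstruction step described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the tensor shapes, the use of a left and a right transformation matrix per image, and the name `reconstruct_images` are all assumptions made for illustration. It only shows how pairing every spectrum tensor with every set of transformation matrices yields many low-rank images from few stored parameters.

```python
import numpy as np

def reconstruct_images(spectra, left, right):
    """Map every spectrum tensor through every transformation pair.

    Assumed (hypothetical) shapes, not taken from the paper:
      spectra: (K, C, r, r)  K small spectrum tensors with C channels
      left:    (M, H, r)     M left transformation matrices
      right:   (M, r, W)     M right transformation matrices
    Returns an array of shape (K * M, C, H, W).
    """
    # images[k, m, c] = left[m] @ spectra[k, c] @ right[m]
    images = np.einsum("mhi,kcij,mjw->kmchw", left, spectra, right)
    K, M, C, H, W = images.shape
    # Flatten the pairwise (k, m) combinations into one batch axis.
    return images.reshape(K * M, C, H, W)

rng = np.random.default_rng(0)
K, M, C, r, H, W = 4, 3, 3, 8, 32, 32  # illustrative sizes only
spectra = rng.standard_normal((K, C, r, r))
left = rng.standard_normal((M, H, r))
right = rng.standard_normal((M, r, W))

images = reconstruct_images(spectra, left, right)
stored = spectra.size + left.size + right.size
print("synthetic images:", images.shape)
print("stored parameters:", stored, "vs. pixel values:", images.size)
```

In this sketch each reconstructed channel has rank at most `r`, and the `K * M` images share the same `K` spectra and `M` transformation pairs, which is one way to read the "information sharing through pairwise combinations" the abstract describes. In a distillation setting the three arrays would be the learnable quantities, optimized with a matching objective such as the trajectory matching the abstract mentions.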

