New Properties of the Data Distillation Method When Working With Tabular Data
This work addresses data efficiency challenges in tabular data processing, though it is incremental as it adapts an existing method to a new domain.
The paper applies a data distillation algorithm, originally designed for images, to tabular data. It finds that models trained on distilled samples can outperform those trained on the original dataset, but that the distilled data generalizes poorly to models with different hyperparameters. Using multiple architectures during distillation mitigates this problem.
Data distillation is the problem of reducing the volume of training data while keeping only the necessary information. In this paper, we explore in more depth a new data distillation algorithm previously designed for image data. Our experiments with tabular data show that a model trained on distilled samples can outperform a model trained on the original dataset. One problem with the considered algorithm is that the data it produces generalizes poorly to models with different hyperparameters. We show that using multiple architectures during distillation can help overcome this problem.