CV · Sep 20, 2024

Data Pruning via Separability, Integrity, and Model Uncertainty-Aware Importance Sampling

arXiv:2409.13915v1 · 1 citation · h-index: 3
Originality: Incremental advance
AI Analysis

This work addresses efficient dataset reduction for image classification, particularly in fine-grained scenarios, though it appears incremental over existing data pruning methods.

The paper tackles data pruning for image classification by introducing a new pruning metric and procedure based on importance sampling, which accounts for data separability, integrity, and model uncertainty; experiments on four benchmark datasets show it scales well to high pruning ratios and generalizes across models.

This paper improves upon existing data pruning methods for image classification by introducing a novel pruning metric and pruning procedure based on importance sampling. The proposed pruning metric explicitly accounts for data separability, data integrity, and model uncertainty, while the sampling procedure is adaptive to the pruning ratio and considers both intra-class and inter-class separation to further enhance the effectiveness of pruning. Furthermore, the sampling method can readily be applied to other pruning metrics to improve their performance. Overall, the proposed approach scales well to high pruning ratios and generalizes better across different classification models, as demonstrated by experiments on four benchmark datasets, including the fine-grained classification scenario.
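To make the idea of pruning by importance sampling concrete, here is a minimal sketch in Python. It is not the paper's metric or sampling procedure: the per-sample `scores` are assumed to come from some pruning metric computed elsewhere, the function name `prune_by_importance_sampling` and its parameters are hypothetical, and the per-class quota is a simple stand-in that does not reproduce the paper's pruning-ratio-adaptive, separation-aware sampling. Weighted sampling without replacement is done with the Efraimidis-Spirakis key trick.

```python
import random
from collections import defaultdict


def prune_by_importance_sampling(scores, labels, keep_ratio, seed=0):
    """Return sorted indices of the samples kept after pruning.

    scores     : per-sample importance scores (assumed to be given by
                 some pruning metric; higher means more likely kept)
    labels     : per-sample class labels
    keep_ratio : fraction of each class to keep, in (0, 1]
    """
    if not 0.0 < keep_ratio <= 1.0:
        raise ValueError("keep_ratio must be in (0, 1]")
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")

    rng = random.Random(seed)

    # Group sample indices by class so every class keeps its share.
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)

    kept = []
    for idxs in by_class.values():
        k = max(1, round(keep_ratio * len(idxs)))
        # Efraimidis-Spirakis: draw key = u ** (1 / w) for each sample
        # and keep the k largest keys. This samples without replacement
        # with a preference for larger weights w.
        keyed = []
        for i in idxs:
            w = max(float(scores[i]), 1e-12)  # guard against zero weights
            u = rng.random()
            keyed.append((u ** (1.0 / w), i))
        keyed.sort(reverse=True)
        kept.extend(i for _, i in keyed[:k])
    return sorted(kept)


if __name__ == "__main__":
    scores = [0.2, 0.9, 0.5, 0.1, 0.7, 0.3]
    labels = [0, 0, 0, 1, 1, 1]
    print(prune_by_importance_sampling(scores, labels, keep_ratio=0.67))
```

Because samples are drawn rather than ranked and truncated, low-scoring samples still have some chance of surviving, which is the usual motivation for sampling over hard thresholding at high pruning ratios. Any other per-sample score could be passed in as `scores`.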

