CV | Sep 20, 2025

IPF-RDA: An Information-Preserving Framework for Robust Data Augmentation

arXiv:2509.16678v1 | 3 citations | h-index: 9 | Has Code | IEEE Trans Pattern Anal Mach Intell
Originality: Incremental advance
AI Analysis

This paper addresses the problem of unreliable data augmentation for deep learning practitioners. The advance is incremental because it builds on existing augmentation methods.

The paper tackles the problem that data augmentation can introduce distribution shifts and noise, which degrade deep network performance. It proposes IPF-RDA, a framework that makes augmentation more robust and consistently improves state-of-the-art augmentation methods across multiple datasets, such as CIFAR-10 and Tiny-ImageNet.

Data augmentation is widely used as an effective technique to enhance the generalization performance of deep models. However, data augmentation may inevitably introduce distribution shifts and noise, which significantly constrain the potential and degrade the performance of deep networks. To this end, we propose a novel information-preserving framework, namely IPF-RDA, to enhance the robustness of data augmentations. IPF-RDA combines (i) a new class-discriminative information estimation algorithm that identifies the points most vulnerable to data augmentation operations, together with their importance scores; and (ii) a new information-preserving scheme that preserves the critical information in the augmented samples and adaptively ensures the diversity of augmented data. We divide data augmentation methods into three categories according to their operation types and integrate each category into our framework accordingly. Once integrated into our framework, data augmentation methods become more robust and their full potential can be unleashed. Extensive experiments demonstrate that, although simple, IPF-RDA consistently improves the performance of numerous commonly used state-of-the-art data augmentation methods with popular deep models on a variety of datasets, including CIFAR-10, CIFAR-100, Tiny-ImageNet, CUHK03, Market1501, Oxford Flower, and MNIST, where its performance and scalability are stressed. The implementation is available at https://github.com/Jackbrocp/IPF-RDA.
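The two components described in the abstract can be illustrated with a rough sketch. It is not the authors' implementation: the importance map is simply supplied as input (the paper estimates it with its own class-discriminative algorithm, which is not reproduced here), the augmentation is a plain Cutout-style erase, and the function names `cutout` and `preserve_important` and the `keep_ratio` parameter are hypothetical. The idea shown is only that the highest-scoring points are restored from the original image after augmentation.

```python
def cutout(image, top, left, size, fill=0.0):
    """Cutout-style augmentation: erase a size x size square (illustrative only)."""
    out = [row[:] for row in image]
    for r in range(top, min(top + size, len(image))):
        for c in range(left, min(left + size, len(image[0]))):
            out[r][c] = fill
    return out


def preserve_important(original, augmented, importance, keep_ratio=0.1):
    """Restore the most important pixels of `original` inside `augmented`.

    `importance` is an assumed per-pixel score map with the same shape as the
    image; how it is estimated is outside the scope of this sketch.
    """
    h, w = len(original), len(original[0])
    coords = [(r, c) for r in range(h) for c in range(w)]
    # Rank pixel coordinates from most to least important.
    coords.sort(key=lambda rc: importance[rc[0]][rc[1]], reverse=True)
    k = max(1, int(keep_ratio * h * w))
    out = [row[:] for row in augmented]
    for r, c in coords[:k]:
        out[r][c] = original[r][c]
    return out


# Toy 4x4 "image" of ones with two hypothetical high-importance points.
image = [[1.0] * 4 for _ in range(4)]
importance = [[0.0] * 4 for _ in range(4)]
importance[1][1] = 0.9
importance[2][2] = 0.8

augmented = cutout(image, top=0, left=0, size=3)
protected = preserve_important(image, augmented, importance, keep_ratio=0.125)
```

In this toy case the erase covers both high-importance points, and the preservation step puts exactly those two pixels back while leaving the rest of the erased region untouched, so the augmented sample stays diverse but keeps the points assumed to carry class information.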

Code Implementations: 1 repo
