The Role of Noisy Data in Improving CNN Robustness for Image Classification
This work addresses the challenge of real-world image noise for CNN practitioners by offering a simple regularization method, though the contribution is incremental because it builds on existing noise-based techniques.
The paper tackles the problem of improving CNN robustness to image corruptions by deliberately adding controlled noise to the training data. It finds that incorporating just 10% noisy data during training significantly reduces test loss and improves accuracy under corrupted test conditions, with minimal impact on clean-data performance.
Data quality plays a central role in the performance and robustness of convolutional neural networks (CNNs) for image classification. While high-quality data is often preferred for training, real-world inputs are frequently affected by noise and other distortions. This paper investigates the effect of deliberately introducing controlled noise into the training data to improve model robustness. Using the CIFAR-10 dataset, we evaluate the impact of three common corruptions, namely Gaussian noise, salt-and-pepper noise, and Gaussian blur, at varying intensities and training set pollution levels. Experiments using a ResNet-18 model reveal that incorporating just 10% noisy data during training is sufficient to significantly reduce test loss and enhance accuracy under fully corrupted test conditions, with minimal impact on clean-data performance. These findings suggest that strategic exposure to noise can act as a simple yet effective regularizer, offering a practical trade-off between traditional data cleanliness and real-world resilience.
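To make the idea of a "training set pollution level" concrete, the following is a minimal NumPy sketch of how a fixed fraction of a training set might be corrupted with each of the three corruption types. It is an illustration under stated assumptions, not the paper's implementation: the function names (`gaussian_noise`, `salt_and_pepper`, `gaussian_blur`, `pollute`), the intensity values, and the random stand-in for CIFAR-10 images are all hypothetical.

```python
import numpy as np

def gaussian_noise(img, sigma, rng):
    """Add zero-mean Gaussian noise to a float image in [0, 1], then clip."""
    return np.clip(img + rng.normal(0.0, sigma, img.shape), 0.0, 1.0)

def salt_and_pepper(img, amount, rng):
    """Set a fraction `amount` of pixels to 0 (pepper) or 1 (salt)."""
    out = img.copy()
    u = rng.random(img.shape[:2])            # one draw per pixel, shared by channels
    out[u < amount / 2] = 0.0                 # pepper
    out[(u >= amount / 2) & (u < amount)] = 1.0  # salt
    return out

def gaussian_blur(img, sigma):
    """Separable Gaussian blur over the two spatial axes, with edge padding."""
    radius = max(1, int(3 * sigma))
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-x**2 / (2 * sigma**2))
    kernel /= kernel.sum()                    # normalize so brightness is preserved
    out = img.astype(np.float64)
    for axis in (0, 1):
        pad = [(0, 0)] * out.ndim
        pad[axis] = (radius, radius)
        padded = np.pad(out, pad, mode="edge")
        out = np.apply_along_axis(
            lambda v: np.convolve(v, kernel, mode="valid"), axis, padded
        )
    return out

def pollute(images, fraction, corrupt, rng):
    """Return a copy of `images` with a random `fraction` of them corrupted,
    together with the indices of the corrupted images."""
    out = images.copy()
    n_noisy = int(round(fraction * len(images)))
    idx = rng.choice(len(images), size=n_noisy, replace=False)
    for i in idx:
        out[i] = corrupt(images[i])
    return out, idx

# Hypothetical usage: random arrays stand in for CIFAR-10 images scaled to [0, 1].
rng = np.random.default_rng(0)
train = rng.random((100, 32, 32, 3))
noisy_train, noisy_idx = pollute(
    train, 0.10, lambda im: gaussian_noise(im, 0.1, rng), rng  # sigma is assumed
)
```

In this sketch the pollution level is the `fraction` argument, so the 10% setting corresponds to `fraction=0.10`, while the corruption intensity is controlled separately by `sigma` or `amount`. Swapping the `corrupt` callable switches between the three corruption types without changing the selection logic.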