CV · Apr 4, 2025

Finding the Reflection Point: Unpadding Images to Remove Data Augmentation Artifacts in Large Open Source Image Datasets for Machine Learning

arXiv:2504.03168v1 · 3.6 · h-index: 8 · MIPR

Originality: Incremental advance

AI Analysis

This work addresses dataset integrity issues for computer vision researchers and practitioners by removing mirrored padding artifacts that cause annotation inconsistencies and distorted objects, enabling more reliable model evaluation across tasks.

The paper tackles the problem of detecting and removing noisy mirrored padding artifacts in images from large open-source datasets, which degrade model evaluation when datasets are repurposed across domains. The resulting algorithm locates the reflection boundary and strips the padding, which improved zero-shot object detection with OWLv2 on the SHEL5k dataset: average precision rose from 0.47 to 0.61 for hard hat detection and from 0.68 to 0.73 for person detection.

In this paper, we address a novel image restoration problem relevant to machine learning dataset curation: the detection and removal of noisy mirrored padding artifacts. While data augmentation techniques like padding are necessary for standardizing image dimensions, they can introduce artifacts that degrade model evaluation when datasets are repurposed across domains. We propose a systematic algorithm to precisely delineate the reflection boundary through a minimum mean squared error approach with thresholding and remove reflective padding. Our method effectively identifies the transition between authentic content and its mirrored counterpart, even in the presence of compression or interpolation noise. We demonstrate our algorithm's efficacy on the SHEL5k dataset, showing significant performance improvements in zero-shot object detection tasks using OWLv2, with average precision increasing from 0.47 to 0.61 for hard hat detection and from 0.68 to 0.73 for person detection. By addressing annotation inconsistencies and distorted objects in padded regions, our approach enhances dataset integrity, enabling more reliable model evaluation across computer vision tasks.
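The boundary search described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a grayscale image, padding on the right edge only, and a mirror that repeats the edge column (NumPy's `symmetric` mode). The function name `find_right_reflection_boundary` and the parameters `mse_threshold` and `min_pad` are hypothetical, chosen for illustration. For each candidate boundary, the strip to its right is compared against the flipped strip of equal width to its left; the candidate with the lowest mean squared error is accepted only if that error falls below the threshold, which is what lets small compression or interpolation noise through.

```python
import numpy as np


def find_right_reflection_boundary(img, mse_threshold=0.01, min_pad=2):
    """Return the number of authentic columns in a right-padded image.

    Assumptions (illustrative only): 2-D grayscale input scaled to [0, 1],
    mirrored padding on the right side, edge column repeated in the mirror.
    Returns the full width when no candidate passes the threshold.
    """
    img = np.asarray(img, dtype=np.float64)
    width = img.shape[1]
    best_boundary, best_mse = width, np.inf

    # The padding cannot be wider than the content it mirrors, so the
    # boundary must lie in the right half of the image.
    for boundary in range((width + 1) // 2, width - min_pad + 1):
        pad = width - boundary
        padded_strip = img[:, boundary:]
        # Columns just left of the boundary, flipped horizontally.
        mirrored_strip = img[:, boundary - pad:boundary][:, ::-1]
        mse = np.mean((padded_strip - mirrored_strip) ** 2)
        if mse < best_mse:
            best_boundary, best_mse = boundary, mse

    # Thresholding: accept the minimum only if it is a convincing match.
    return best_boundary if best_mse <= mse_threshold else width


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    content = rng.random((32, 40))                       # authentic region
    padded = np.pad(content, ((0, 0), (0, 20)), mode="symmetric")
    noisy = padded + rng.normal(0.0, 0.01, padded.shape)  # simulated noise

    boundary = find_right_reflection_boundary(noisy)
    unpadded = noisy[:, :boundary]                        # crop the padding
    print(boundary, unpadded.shape)
```

A full unpadding routine would repeat the same search on the other three edges (for example by flipping or transposing the image first), and the threshold would need tuning to the noise level of the dataset at hand.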
