CV, AI, Nov 25, 2022

Combating noisy labels in object detection datasets

arXiv:2211.13993v3, 7 citations, h-index: 18
Originality: Incremental advance
AI Analysis

This paper addresses dataset quality for object detection practitioners, offering a method to clean datasets rather than tolerate label errors. The advance is incremental, as it builds on existing confident learning approaches.

The paper tackles noisy labels in object detection datasets by proposing the CLOD algorithm, which identifies and suggests corrections for errors such as missing or mislocated bounding boxes. Applying the most confident suggestions improved mAP scores by 16% to 46%, depending on the dataset, without modifying network architectures.

The quality of training datasets for deep neural networks is a key factor contributing to the accuracy of resulting models. This effect is amplified in difficult tasks such as object detection. Dealing with errors in datasets is often limited to accepting that some fraction of examples are incorrect, estimating their confidence, and either assigning appropriate weights or ignoring uncertain ones during training. In this work, we propose a different approach. We introduce the Confident Learning for Object Detection (CLOD) algorithm for assessing the quality of each label in object detection datasets, identifying missing, spurious, mislabeled, and mislocated bounding boxes and suggesting corrections. By focusing on finding incorrect examples in the training datasets, we can eliminate them at the root. Suspicious bounding boxes can be reviewed to improve the quality of the dataset, leading to better models without further complicating their already complex architectures. The proposed method is able to point out nearly 80% of artificially disturbed bounding boxes with a false positive rate below 0.1. Cleaning the datasets by applying the most confident automatic suggestions improved mAP scores by 16% to 46%, depending on the dataset, without any modifications to the network architectures. This approach shows promising potential in rectifying state-of-the-art object detection datasets.
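The four error types named in the abstract (missing, spurious, mislabeled, and mislocated boxes) can be illustrated with a minimal sketch. This is not the authors' CLOD implementation: it is a simplified heuristic that compares annotations against a detector's high-confidence predictions using IoU. The function name `flag_label_issues` and the thresholds `conf_thr`, `match_iou`, and `tight_iou` are assumptions chosen for illustration, and a single fixed confidence threshold stands in for the per-class thresholds that confident learning would normally estimate.

```python
from typing import List, Optional, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def flag_label_issues(
    labels: List[Tuple[Box, str]],
    preds: List[Tuple[Box, str, float]],
    conf_thr: float = 0.9,   # hypothetical: keep only confident predictions
    match_iou: float = 0.5,  # hypothetical: minimum overlap to call a match
    tight_iou: float = 0.8,  # hypothetical: overlap below this is "mislocated"
) -> List[Tuple[str, Optional[int], Optional[int]]]:
    """Return (issue, label_index, confident_pred_index) triples."""
    confident = [p for p in preds if p[2] >= conf_thr]
    matched = set()
    issues: List[Tuple[str, Optional[int], Optional[int]]] = []

    for li, (lbox, lcls) in enumerate(labels):
        # Find the confident prediction that overlaps this label the most.
        best_j, best_iou = -1, 0.0
        for j, (pbox, _, _) in enumerate(confident):
            v = iou(lbox, pbox)
            if v > best_iou:
                best_j, best_iou = j, v

        if best_j < 0 or best_iou < match_iou:
            issues.append(("spurious", li, None))  # nothing confident here
            continue

        matched.add(best_j)
        if confident[best_j][1] != lcls:
            issues.append(("mislabeled", li, best_j))  # class disagreement
        elif best_iou < tight_iou:
            issues.append(("mislocated", li, best_j))  # loose localization

    # Confident predictions that no label matched suggest missing annotations.
    for j in range(len(confident)):
        if j not in matched:
            issues.append(("missing", None, j))
    return issues


# Toy data, invented for this example only.
labels = [
    ((0, 0, 10, 10), "cat"),
    ((20, 20, 30, 30), "dog"),
    ((40, 40, 50, 50), "cat"),
    ((60, 60, 70, 70), "cat"),
]
preds = [
    ((0, 0, 10, 10), "cat", 0.95),
    ((20, 20, 30, 30), "cat", 0.97),
    ((41, 41, 51, 51), "cat", 0.93),
    ((80, 80, 90, 90), "dog", 0.99),
    ((60, 60, 70, 70), "cat", 0.30),  # below conf_thr, ignored
]
issues = flag_label_issues(labels, preds)
```

In a review workflow like the one the abstract describes, the flagged entries would be shown to an annotator rather than applied blindly. A "spurious" flag in this sketch only means the detector found nothing confident at that location, which may also reflect a model failure.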

Code Implementations: 1 repo
