CVAIJun 28, 2023

A systematic study of the foreground-background imbalance problem in deep learning for object detection

arXiv:2306.16539v16 citationsh-index: 13
Originality Incremental advance
AI Analysis

This addresses a common but understudied issue in object detection for computer vision applications, providing incremental insights through comprehensive benchmarking.

The paper systematically analyzes the foreground-background imbalance problem in object detection, finding it causes significant performance drops, especially with less training data, and identifies Libra-RCNN and PISA as top-performing methods among those compared.

The class imbalance problem in deep learning has been explored in several studies, but there has yet to be a systematic analysis of this phenomenon in object detection. Here, we present comprehensive analyses and experiments of the foreground-background (F-B) imbalance problem in object detection, which is very common and caused by small, infrequent objects of interest. We experimentally study the effects of different aspects of F-B imbalance (object size, number of objects, dataset size, object type) on detection performance. In addition, we also compare 9 leading methods for addressing this problem, including Faster-RCNN, SSD, OHEM, Libra-RCNN, Focal-Loss, GHM, PISA, YOLO-v3, and GFL with a range of datasets from different imaging domains. We conclude that (1) the F-B imbalance can indeed cause a significant drop in detection performance, (2) The detection performance is more affected by F-B imbalance when fewer training data are available, (3) in most cases, decreasing object size leads to larger performance drop than decreasing number of objects, given the same change in the ratio of object pixels to non-object pixels, (6) among all selected methods, Libra-RCNN and PISA demonstrate the best performance in addressing the issue of F-B imbalance. (7) When the training dataset size is large, the choice of method is not impactful (8) Soft-sampling methods, including focal-loss, GHM, and GFL, perform fairly well on average but are relatively unstable.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes