Gaussian Bounding Boxes and Probabilistic Intersection-over-Union for Object Detection
This work aims to improve object detection accuracy through a better bounding box representation; the contribution is incremental in that it builds on existing detection frameworks.
The paper addresses how object shape and location are represented in object detection. It proposes Gaussian distributions as fuzzy bounding boxes, together with a Probabilistic Intersection-over-Union (ProbIoU) similarity measure based on the Hellinger distance. The results show that the Gaussian representations are closer to annotated segmentation masks in publicly available datasets, and that ProbIoU-based losses can successfully regress the Gaussian parameters.
Most object detection methods use bounding boxes to encode and represent the object shape and location. In this work, we explore a fuzzy representation of object regions using Gaussian distributions, which provides an implicit binary representation as (potentially rotated) ellipses. We also present a similarity measure for the Gaussian distributions based on the Hellinger Distance, which can be viewed as a Probabilistic Intersection-over-Union (ProbIoU). Our experimental results show that the proposed Gaussian representations are closer to annotated segmentation masks in publicly available datasets, and that loss functions based on ProbIoU can be successfully used to regress the parameters of the Gaussian representation. Furthermore, we present a simple mapping scheme from traditional (or rotated) bounding boxes to Gaussian representations, allowing the proposed ProbIoU-based losses to be seamlessly integrated into any object detector.
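The abstract does not spell out the mapping or the similarity formula, so the following is only a minimal sketch of how the two pieces could fit together. It assumes that a box of width w and height h maps to a Gaussian with variances w^2/12 and h^2/12 along the box axes (the variances of a uniform distribution over the box), and that ProbIoU is taken as one minus the Hellinger distance, computed from the standard closed-form Bhattacharyya distance between two Gaussians. The function names `box_to_gaussian` and `prob_iou` are hypothetical and chosen for illustration.

```python
import math


def box_to_gaussian(cx, cy, w, h, angle=0.0):
    """Map a (possibly rotated) box to a 2D Gaussian.

    Assumption (not stated in the text above): the variances along the box
    axes are w^2/12 and h^2/12, as for a uniform distribution over the box.
    Returns the mean (cx, cy) and the covariance entries (a, b, c) of the
    symmetric matrix [[a, c], [c, b]].
    """
    var_w = w * w / 12.0
    var_h = h * h / 12.0
    cos_t, sin_t = math.cos(angle), math.sin(angle)
    # Covariance = R * diag(var_w, var_h) * R^T for a rotation R by `angle`.
    a = var_w * cos_t ** 2 + var_h * sin_t ** 2
    b = var_w * sin_t ** 2 + var_h * cos_t ** 2
    c = (var_w - var_h) * cos_t * sin_t
    return (cx, cy), (a, b, c)


def prob_iou(g1, g2):
    """Similarity in [0, 1] between two Gaussians, taken as 1 - Hellinger."""
    (x1, y1), (a1, b1, c1) = g1
    (x2, y2), (a2, b2, c2) = g2
    # Average of the two covariance matrices.
    a, b, c = (a1 + a2) / 2.0, (b1 + b2) / 2.0, (c1 + c2) / 2.0
    det = a * b - c * c
    det1 = a1 * b1 - c1 * c1
    det2 = a2 * b2 - c2 * c2
    dx, dy = x1 - x2, y1 - y2
    # Mahalanobis term d^T S^{-1} d with S^{-1} = [[b, -c], [-c, a]] / det.
    maha = (b * dx * dx - 2.0 * c * dx * dy + a * dy * dy) / det
    # Bhattacharyya distance between the two Gaussians.
    bd = maha / 8.0 + 0.5 * math.log(det / math.sqrt(det1 * det2))
    bc = math.exp(-bd)  # Bhattacharyya coefficient
    # Clamp guards against tiny negative values from floating-point rounding.
    hellinger = math.sqrt(max(0.0, 1.0 - bc))
    return 1.0 - hellinger


# Usage: compare a predicted box against a ground-truth box.
gt = box_to_gaussian(10.0, 10.0, 4.0, 2.0)
pred = box_to_gaussian(10.5, 10.0, 4.0, 2.0, angle=0.1)
loss = 1.0 - prob_iou(gt, pred)  # one possible regression loss
```

Because both functions use only differentiable operations, the same computation could in principle be written with tensor operations and used as a regression loss inside a detector; the choice of 1 - ProbIoU as the loss here is one option among several, not necessarily the one used by the authors.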