Generalized Intersection over Union: A Metric and A Loss for Bounding Box Regression
This work targets the mismatch between bounding box regression losses and the IoU evaluation metric in object detection, offering computer vision researchers and practitioners an incremental but practical enhancement to existing frameworks.
The paper tackles the gap between optimizing distance losses and maximizing Intersection over Union (IoU) in object detection by introducing Generalized IoU (GIoU) as both a metric and loss, showing consistent performance improvements on benchmarks like PASCAL VOC and MS COCO.
Intersection over Union (IoU) is the most popular evaluation metric used in object detection benchmarks. However, there is a gap between optimizing the commonly used distance losses for regressing the parameters of a bounding box and maximizing this metric value. The optimal objective for a metric is the metric itself. In the case of axis-aligned 2D bounding boxes, it can be shown that $IoU$ can be directly used as a regression loss. However, $IoU$ has a plateau, making it infeasible to optimize in the case of non-overlapping bounding boxes. In this paper, we address the weaknesses of $IoU$ by introducing a generalized version as both a new loss and a new metric. By incorporating this generalized $IoU$ ($GIoU$) as a loss into state-of-the-art object detection frameworks, we show a consistent improvement in their performance using both the standard, $IoU$-based, and new, $GIoU$-based, performance measures on popular object detection benchmarks such as PASCAL VOC and MS COCO.
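To make the plateau concrete, here is a minimal sketch of $IoU$ and $GIoU$ for two axis-aligned boxes. The abstract does not spell out the formula, so the sketch assumes the commonly stated definition $GIoU = IoU - |C \setminus (A \cup B)| / |C|$, where $C$ is the smallest axis-aligned box enclosing both $A$ and $B$. The function name `iou_and_giou` and the `(x1, y1, x2, y2)` box layout are illustrative choices, not taken from the paper.

```python
def iou_and_giou(a, b):
    """Return (IoU, GIoU) for two axis-aligned boxes.

    Each box is (x1, y1, x2, y2) with x1 < x2 and y1 < y2.
    Box layout and function name are assumptions made for this sketch.
    """
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)

    # Intersection: clamp width and height at zero for disjoint boxes.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    union = area_a + area_b - inter
    iou = inter / union

    # Smallest axis-aligned box C that encloses both inputs.
    enclose_w = max(ax2, bx2) - min(ax1, bx1)
    enclose_h = max(ay2, by2) - min(ay1, by1)
    enclose = enclose_w * enclose_h

    # Penalize the part of C not covered by the union.
    giou = iou - (enclose - union) / enclose
    return iou, giou


# Two unit boxes that do not overlap, at increasing separation.
near = iou_and_giou((0, 0, 1, 1), (2, 0, 3, 1))
far = iou_and_giou((0, 0, 1, 1), (4, 0, 5, 1))
```

In this example both pairs have an $IoU$ of 0, so an $IoU$-based loss cannot distinguish them, whereas $GIoU$ is lower for the pair that is further apart. A loss of the form $1 - GIoU$ would therefore still vary with the distance between non-overlapping boxes.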