Humans disagree with the IoU for measuring object detector localization error
This work highlights a potential mismatch between automated metrics and human perception in computer vision, which could impact evaluation practices for object detection systems.
The study investigated whether Intersection over Union (IoU) aligns with human judgment when evaluating object detector localization errors. It found that humans often disagree with IoU scores and express preferences between errors that have the same IoU, indicating that IoU alone may be insufficient.
The localization quality of automatic object detectors is typically evaluated by the Intersection over Union (IoU) score. In this work, we show that humans have a different view on localization quality. To evaluate this, we conduct a survey with more than 70 participants. Results show that humans do not necessarily consider localization errors with the exact same IoU score to be equal, and instead express a preference between them. Our work is the first to evaluate IoU with humans, and it makes clear that relying on IoU scores alone to evaluate localization errors might not be sufficient.
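
To illustrate how two different localization errors can receive the exact same score, recall that IoU is the area of the intersection of the predicted and ground-truth boxes divided by the area of their union. The minimal sketch below assumes axis-aligned boxes given as (x1, y1, x2, y2). The function name `iou` and the three example boxes are hypothetical choices made for illustration, and are not taken from the survey itself.

```python
def iou(box_a, box_b):
    """Intersection over Union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Width and height of the overlap region (zero if the boxes are disjoint).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    # Union = sum of both areas minus the overlap counted twice.
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter

    return inter / union if union > 0 else 0.0


# Illustrative boxes (assumed, not from the paper's survey).
ground_truth = (0, 0, 100, 100)
too_small = (0, 0, 100, 50)    # covers only half of the object
too_large = (0, 0, 100, 200)   # covers the object plus an equal area of background

print(iou(ground_truth, too_small))  # 0.5
print(iou(ground_truth, too_large))  # 0.5
```

Both predictions score an IoU of 0.5, yet one cuts the object in half while the other contains it entirely. A metric based on IoU alone cannot distinguish between them, which is the kind of case where a human preference could emerge.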