CV · Aug 13, 2020

Shift Equivariance in Object Detection

arXiv:2008.05787v1 · 20 citations
Originality: Incremental advance
AI Analysis

This paper addresses robustness issues in object detection for applications such as autonomous driving and surveillance, but it is an incremental advance because it focuses on analysis rather than a novel solution.

The paper tackles the problem of shift equivariance in object detection by proposing an evaluation metric to assess model sensitivity to image shifts, finding that modern detectors are sensitive to even one-pixel shifts and that existing solutions fail to provide full equivariance.

Robustness to small image translations is a highly desirable property for object detectors. However, recent works have shown that CNN-based classifiers are not shift invariant. It is unclear to what extent this could impact object detection, mainly because of the architectural differences between the two and the dimensionality of the prediction space of modern detectors. To assess shift equivariance of object detection models end-to-end, in this paper we propose an evaluation metric, built upon a greedy search of the lower and upper bounds of the mean average precision on a shifted image set. Our new metric shows that modern object detection architectures, whether one-stage or two-stage, anchor-based or anchor-free, are sensitive to even a one-pixel shift of the input images. Furthermore, we investigate several possible solutions to this problem, both taken from the literature and newly proposed, quantifying the effectiveness of each one with the suggested metric. Our results indicate that none of these methods can provide full shift equivariance. Measuring and analyzing the extent of shift variance of different models and the contributions of possible factors is a first step towards being able to devise methods that mitigate or even leverage such variabilities.
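
The abstract describes the metric only as a greedy search for lower and upper bounds of a dataset-level score over a shifted image set. The following is a minimal sketch of one way such a search could look, not the authors' implementation. The function name `greedy_metric_bounds`, the `per_image_outputs` layout, and the `metric` callable (which would be a mean average precision evaluator in practice) are all hypothetical names introduced for illustration.

```python
from typing import Callable, List, Sequence, Tuple


def greedy_metric_bounds(
    per_image_outputs: Sequence[Sequence[object]],
    metric: Callable[[List[object]], float],
) -> Tuple[float, float]:
    """Greedily estimate lower and upper bounds of a dataset-level metric.

    per_image_outputs[i][s] is the detector output for image i under
    shift s, with index 0 assumed to be the unshifted image.
    `metric` scores one chosen output per image (assumption: in practice
    this would compute mAP against ground truth).
    """

    def search(is_better: Callable[[float, float], bool]) -> float:
        # Start from the unshifted output of every image.
        chosen = [outputs[0] for outputs in per_image_outputs]
        best = metric(chosen)
        # Visit images one at a time and keep any shift that improves
        # the objective. This is greedy: earlier choices are never revisited.
        for i, outputs in enumerate(per_image_outputs):
            for candidate in outputs[1:]:
                trial = chosen[:i] + [candidate] + chosen[i + 1:]
                score = metric(trial)
                if is_better(score, best):
                    chosen, best = trial, score
        return best

    lower = search(lambda new, old: new < old)
    upper = search(lambda new, old: new > old)
    return lower, upper


if __name__ == "__main__":
    # Toy stand-in: each "output" is already a per-image score and the
    # dataset metric is their mean. A real evaluation would use mAP.
    toy_outputs = [[0.5, 0.2, 0.9], [0.7, 0.6, 0.8]]
    mean = lambda xs: sum(xs) / len(xs)
    print(greedy_metric_bounds(toy_outputs, mean))
```

The gap between the two returned values gives a rough indication of how much the score can move under the shifts considered. Because the search is greedy, the values are estimates of the bounds rather than guaranteed extremes.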

