Mind the Pad -- CNNs can Develop Blind Spots
This paper addresses a subtle but impactful architectural flaw in CNNs that can cause misdetections, particularly of small objects, although the contribution is incremental.
The paper shows that convolutional neural networks can develop spatial bias when padding is applied unevenly. This bias suppresses activations in certain areas of the feature maps and harms tasks such as small object detection. The paper also proposes mitigations that improve model accuracy.
We show how feature maps in convolutional networks are susceptible to spatial bias. Due to a combination of architectural choices, the activation at certain locations is systematically elevated or weakened. The major source of this bias is the padding mechanism. Depending on several aspects of convolution arithmetic, this mechanism can apply the padding unevenly, leading to asymmetries in the learned weights. We demonstrate how such bias can be detrimental to certain tasks such as small object detection: the activation is suppressed if the stimulus lies in the impacted area, leading to blind spots and misdetection. We propose solutions to mitigate spatial bias and demonstrate how they can improve model accuracy.
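To make the "convolution arithmetic" point concrete, here is a minimal sketch of how a strided convolution can consume padding unevenly along one axis. It assumes symmetric zero padding of `p` on both sides and the usual floor-based output-size formula. The helper name `effective_padding` and the example sizes (224 and 225 with a 3-wide kernel, stride 2, padding 1) are illustrative assumptions, not taken from the text above.

```python
def effective_padding(n: int, k: int, s: int, p: int) -> tuple[int, int]:
    """Return the (left, right) padding actually covered by kernel windows.

    n: input size along one axis, k: kernel size, s: stride,
    p: nominal padding added on each side (assumed symmetric).
    """
    # Standard output size for a strided convolution (floor division).
    out = (n + 2 * p - k) // s + 1
    # Number of positions in the padded input that the windows span,
    # counted from the left edge.
    covered = (out - 1) * s + k
    # Trailing positions that no kernel window ever reaches.
    unused = (n + 2 * p) - covered
    # Unreached positions come out of the right-hand padding first.
    right = max(p - unused, 0)
    return p, right


if __name__ == "__main__":
    # Even input size: the last padded column is never visited.
    print(effective_padding(224, 3, 2, 1))  # (1, 0)
    # Odd input size: padding is consumed on both sides.
    print(effective_padding(225, 3, 2, 1))  # (1, 1)
```

Under these assumptions, the 224-wide input sees padding only on the left, while the 225-wide input sees it on both sides. This is one way the asymmetry described above can arise from the interaction of input size, kernel size, and stride.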