Spatially-Adaptive Filter Units for Deep Neural Networks
This work offers a more efficient and adaptive convolutional unit for deep neural networks, potentially benefiting researchers and practitioners in computer vision by reducing model complexity and improving convergence.
This paper introduces a novel displaced aggregation unit (DAU) that learns to spatially adapt its receptive field, eliminating the need for hand-crafted dilated convolutions. Networks using DAUs achieve comparable performance to classical ConvNets with up to a 3-times reduction in parameters and faster convergence.
Classical deep convolutional networks increase receptive field size either by gradually reducing resolution or by applying hand-crafted dilated convolutions, to prevent an increase in the number of parameters. In this paper we propose a novel displaced aggregation unit (DAU) that does not require hand-crafting. In contrast to classical filters, whose units (pixels) are placed on a fixed regular grid, the displacements of the DAUs are learned, which enables filters to spatially adapt their receptive fields to a given problem. We extensively demonstrate the strength of DAUs on classification and semantic segmentation tasks. Compared to ConvNets with regular filters, ConvNets with DAUs achieve comparable performance with faster convergence and up to a 3-times reduction in parameters. Furthermore, DAUs allow us to study deep networks from novel perspectives: we study the spatial distributions of DAU filters and analyze the number of parameters allocated for spatial coverage in a filter.
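To make the idea of units with learned displacements concrete, the following is a minimal single-channel sketch in NumPy, not the authors' implementation. It assumes a filter response can be written as a weighted sum of displaced copies of the input, with fractional displacements handled by bilinear interpolation and zero padding. The names `shift_bilinear` and `dau_response` are hypothetical, and the abstract does not specify whether the actual DAU formulation includes further components (such as smoothing of each unit).

```python
import numpy as np


def shift_bilinear(x, dy, dx):
    """Sample a 2-D array x at (i + dy, j + dx) using bilinear
    interpolation; samples outside the array are treated as zero."""
    h, w = x.shape
    ys = np.arange(h)[:, None] + float(dy)  # row coordinates, shape (h, 1)
    xs = np.arange(w)[None, :] + float(dx)  # column coordinates, shape (1, w)
    y0 = np.floor(ys).astype(int)
    x0 = np.floor(xs).astype(int)
    fy = ys - y0  # fractional part along rows
    fx = xs - x0  # fractional part along columns

    def at(yi, xi):
        # Read x[yi, xi] with zero padding outside the valid range.
        valid = (yi >= 0) & (yi < h) & (xi >= 0) & (xi < w)
        vals = x[np.clip(yi, 0, h - 1), np.clip(xi, 0, w - 1)]
        return np.where(valid, vals, 0.0)

    return ((1 - fy) * (1 - fx) * at(y0, x0)
            + (1 - fy) * fx * at(y0, x0 + 1)
            + fy * (1 - fx) * at(y0 + 1, x0)
            + fy * fx * at(y0 + 1, x0 + 1))


def dau_response(x, weights, displacements):
    """Assumed form of a displaced-aggregation filter: a weighted sum of
    displaced copies of x. Each unit has one weight and one (dy, dx)
    displacement; in a trained network both would be learned."""
    out = np.zeros_like(x, dtype=float)
    for wgt, (dy, dx) in zip(weights, displacements):
        out += wgt * shift_bilinear(x, dy, dx)
    return out


# Illustrative values only: three units, one placed far from the centre.
image = np.arange(64, dtype=float).reshape(8, 8)
weights = [0.5, 0.3, 0.2]
displacements = [(0.0, 0.0), (-1.5, 2.0), (3.0, -2.5)]
response = dau_response(image, weights, displacements)
```

In this sketch each unit carries three numbers (one weight and a two-dimensional displacement), so the spatial extent of the filter is set by where the units are placed rather than by the size of a dense grid. A dense k×k filter, by contrast, needs k² weights to cover the same window.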