Truly Scale-Equivariant Deep Nets with Fourier Layers
This work addresses a specific problem in computer vision, relevant to tasks such as image segmentation, by providing a truly scale-equivariant solution. The contribution is incremental, since it builds on prior scale-equivariant networks.
The paper tackles true scale-equivariance in deep networks by accounting for anti-aliasing in the down-scaling operation. It proposes a novel architecture based on Fourier layers that achieves absolute zero equivariance-error while maintaining competitive classification performance on the MNIST-scale and STL-10 datasets.
In computer vision, models must be able to adapt to changes in image resolution to effectively carry out tasks such as image segmentation; this is known as scale-equivariance. Recent works have made progress in developing scale-equivariant convolutional neural networks, e.g., through weight-sharing and kernel resizing. However, these networks are not truly scale-equivariant in practice. Specifically, they do not consider anti-aliasing, because they formulate the down-scaling operation in the continuous domain. To address this shortcoming, we directly formulate down-scaling in the discrete domain with consideration of anti-aliasing. We then propose a novel architecture based on Fourier layers to achieve truly scale-equivariant deep nets, i.e., absolute zero equivariance-error. Following prior works, we test this model on the MNIST-scale and STL-10 datasets. Our proposed model achieves competitive classification performance while maintaining zero equivariance-error.
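To make the idea of discrete, anti-aliased down-scaling concrete, the following is a minimal 1D sketch in NumPy and not the architecture proposed in the paper. It assumes that down-scaling is an ideal low-pass filter, implemented by truncating the discrete Fourier transform, and that a "Fourier layer" multiplies each frequency coefficient by a weight that depends only on that frequency. The function names `downscale` and `fourier_layer`, the odd signal lengths, and the random weights are illustrative assumptions. Under these assumptions, the layer commutes with down-scaling up to floating-point rounding.

```python
import numpy as np


def downscale(x, m):
    """Ideal low-pass down-scaling of a 1D signal to odd length m (m <= len(x)).

    Keeps only the m lowest frequencies of the DFT, which removes any
    content that would alias at the lower resolution. Assumes len(x) is odd.
    """
    n = x.shape[0]
    X = np.fft.fftshift(np.fft.fft(x))   # zero frequency moves to index n // 2
    c, h = n // 2, m // 2
    Xm = X[c - h : c + h + 1]            # frequencies -h .. h
    return np.fft.ifft(np.fft.ifftshift(Xm)) * (m / n)


def fourier_layer(x, w):
    """Multiply the coefficient at each frequency k by the weight w[k].

    w is stored with the zero frequency at its center (odd length), so the
    same weight is applied to frequency k at every resolution.
    """
    n = x.shape[0]
    X = np.fft.fftshift(np.fft.fft(x))
    c, h = w.shape[0] // 2, n // 2
    Y = X * w[c - h : c + h + 1]
    return np.fft.ifft(np.fft.ifftshift(Y))


# Hypothetical input and weights, chosen only for illustration.
rng = np.random.default_rng(0)
x = rng.standard_normal(33)
w = rng.standard_normal(33) + 1j * rng.standard_normal(33)

a = downscale(fourier_layer(x, w), 17)   # apply layer, then down-scale
b = fourier_layer(downscale(x, 17), w)   # down-scale, then apply layer
print("equivariance error:", np.max(np.abs(a - b)))
```

The two paths agree because both truncation and per-frequency multiplication act independently on each Fourier coefficient. A spatial convolution followed by naive subsampling generally lacks this property, since high-frequency content folds back into the low-frequency band.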