Learning to Convolve: A Generalized Weight-Tying Approach
This work addresses a specific bottleneck in group convolution methods for computer vision: filter transformations normally have to be defined by hand. It offers an incremental improvement by learning those transformations instead.
The paper tackles the difficulty of manually defining filter transformations for group convolutions. It learns a filter basis together with rotated versions of that basis, so a filter can be rotated simply by switching the basis. The authors show that this approach produces feature maps with low sensitivity to input rotations while achieving high performance on MNIST and CIFAR-10; the abstract does not report specific accuracy numbers.
Recent work (Cohen & Welling, 2016) has shown that generalizations of convolutions, based on group theory, provide powerful inductive biases for learning. In these generalizations, filters are not only translated but can also be rotated, flipped, etc. However, coming up with exact models of how to rotate a 3 × 3 filter on a square pixel-grid is difficult. In this paper, we learn how to transform filters for use in the group convolution, focussing on roto-translation. For this, we learn a filter basis and all rotated versions of that filter basis. Filters are then encoded by a set of rotation-invariant coefficients. To rotate a filter, we switch the basis. We demonstrate that we can produce feature maps with low sensitivity to input rotations, while achieving high performance on MNIST and CIFAR-10.
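The basis-switching idea can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the names `rotated_filters`, `base_basis`, `basis_stack`, and `coeffs` are hypothetical, and the rotated bases here are produced with exact 90-degree rotations (`np.rot90`) as a stand-in for the rotated bases that the paper learns. The point is only that a single coefficient vector, shared across all rotations, yields a rotated filter when combined with a different basis.

```python
import numpy as np


def rotated_filters(coeffs, basis_stack):
    """Build one filter per rotation from shared coefficients.

    coeffs:      (B,) coefficients, shared across all rotations.
    basis_stack: (R, B, k, k) array, one basis of B elements per rotation.
    returns:     (R, k, k) array; entry r is the filter expressed in basis r.
    """
    return np.einsum("b,rbij->rij", coeffs, basis_stack)


rng = np.random.default_rng(0)
B, k = 9, 3  # assumed sizes: 9 basis elements for 3 x 3 filters

# Hypothetical base basis. In the paper this would be learned.
base_basis = rng.standard_normal((B, k, k))

# ASSUMPTION: stand-in for the learned rotated bases, using exact
# 90-degree rotations of each basis element. Shape: (4, B, k, k).
basis_stack = np.stack(
    [np.rot90(base_basis, k=r, axes=(1, 2)) for r in range(4)]
)

# One set of coefficients encodes the filter for every rotation.
coeffs = rng.standard_normal(B)
filters = rotated_filters(coeffs, basis_stack)

# Switching the basis rotates the filter; the coefficients never change.
for r in range(4):
    assert np.allclose(filters[r], np.rot90(filters[0], k=r))
```

Because the filter is a linear combination of basis elements, rotating every basis element rotates the combined filter, which is why the coefficients can stay fixed. For rotations that are not multiples of 90 degrees there is no exact pixel-grid rotation, which is the gap the learned bases are meant to fill.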