Rotation equivariant vector field networks
This work addresses the need for more efficient, rotation-aware models in computer vision, offering compact networks for tasks in which the output must respond to rotations of the input in a prescribed way.
The paper tackles the problem of encoding rotation equivariance in computer vision models by proposing Rotation Equivariant Vector Field Networks (RotEqNet), which reduce model size and complexity while achieving results comparable to those of much larger networks on tasks such as image classification and segmentation.
In many computer vision tasks, we expect a particular behavior of the output with respect to rotations of the input image. If this relationship is explicitly encoded, instead of being treated as any other variation, the complexity of the problem is decreased, leading to a reduction in the size of the required model. In this paper, we propose the Rotation Equivariant Vector Field Networks (RotEqNet), a Convolutional Neural Network (CNN) architecture encoding rotation equivariance, invariance and covariance. Each convolutional filter is applied at multiple orientations and returns a vector field representing the magnitude and angle of the highest scoring orientation at every spatial location. We develop a modified convolution operator relying on this representation to obtain deep architectures. We test RotEqNet on several problems requiring different responses with respect to rotations of the input: image classification, biomedical image segmentation, orientation estimation and patch matching. In all cases, we show that RotEqNet offers extremely compact models in terms of number of parameters and provides results in line with those of networks orders of magnitude larger.
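The orientation step described in the abstract (apply one filter at several orientations, then keep the magnitude and angle of the best-scoring one at each location) can be illustrated with a minimal sketch. This is not the authors' implementation: the helper names `rot90`, `correlate_valid` and `orientation_pool` are hypothetical, only four orientations (multiples of 90 degrees) are used so that rotating the filter needs no interpolation, and clamping negative scores to zero is an assumption made here so that the magnitude is non-negative.

```python
import math

def rot90(grid):
    """Rotate a square grid (list of lists) by 90 degrees counter-clockwise."""
    n = len(grid)
    return [[grid[c][n - 1 - r] for c in range(n)] for r in range(n)]

def correlate_valid(image, kernel):
    """Plain 2D cross-correlation without padding ('valid' output size)."""
    k = len(kernel)
    h, w = len(image), len(image[0])
    return [
        [
            sum(image[i + a][j + b] * kernel[a][b]
                for a in range(k) for b in range(k))
            for j in range(w - k + 1)
        ]
        for i in range(h - k + 1)
    ]

def orientation_pool(image, kernel):
    """Apply `kernel` at 0, 90, 180 and 270 degrees and keep, per location,
    the score and angle of the highest scoring orientation.

    Assumption: four orientations only, negative scores clamped to zero.
    Returns (magnitude, angle) as two grids of the same size.
    """
    kernels = [kernel]
    for _ in range(3):
        kernels.append(rot90(kernels[-1]))
    maps = [correlate_valid(image, k) for k in kernels]
    step = 2 * math.pi / len(kernels)  # angular spacing between orientations

    rows, cols = len(maps[0]), len(maps[0][0])
    magnitude = [[0.0] * cols for _ in range(rows)]
    angle = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            scores = [m[i][j] for m in maps]
            best = max(range(len(scores)), key=scores.__getitem__)
            magnitude[i][j] = max(scores[best], 0.0)
            angle[i][j] = best * step
    return magnitude, angle

# Toy input: intensity increases from left to right.
image = [[float(j) for j in range(5)] for _ in range(5)]
# Filter that responds to a left-to-right intensity increase.
kernel = [[-1.0, 0.0, 1.0] for _ in range(3)]

mag, ang = orientation_pool(image, kernel)
print(mag[0][0], round(ang[0][0], 4))      # 6.0 0.0

# Rotating the input by 90 degrees keeps the magnitude and shifts the angle.
mag_r, ang_r = orientation_pool(rot90(image), kernel)
print(mag_r[0][0], round(ang_r[0][0], 4))  # 6.0 1.5708
```

In this toy case the magnitude is unchanged when the input is rotated, while the angle shifts by the rotation applied, which is the kind of equivariant response the abstract refers to. A vector field in Cartesian form follows from each pair as `(m * cos(a), m * sin(a))`.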