Higher Order Gauge Equivariant CNNs on Riemannian Manifolds and Applications
This work addresses the need for efficient, equivariant models in computer vision and neuroimaging, particularly for disease classification from medical images. The contribution is incremental, however, since it builds on existing gauge equivariant convolutions.
The paper tackles the problem of modeling spatially extended nonlinear interactions while maintaining equivariance to global isometries. It introduces a higher order generalization of gauge equivariant convolutions, implemented as the gauge equivariant Volterra network (GEVNet). It demonstrates the model's parameter efficiency on spherical MNIST and improved classification performance on neuroimaging data, where the task is to discriminate between Lewy Body Disease, Alzheimer's Disease, and Parkinson's Disease from diffusion MRI.
With the advent of group equivariant convolutions in the deep network literature, spherical CNNs with $\mathsf{SO}(3)$-equivariant layers have been developed to cope with data that are samples of signals on the sphere $S^2$. One can implicitly obtain $\mathsf{SO}(3)$-equivariant convolutions on $S^2$ with significant efficiency gains by explicitly requiring gauge equivariance w.r.t. $\mathsf{SO}(2)$. In this paper, we build on this fact by introducing a higher order generalization of the gauge equivariant convolution, whose implementation is dubbed a gauge equivariant Volterra network (GEVNet). This allows us to model spatially extended nonlinear interactions within a given receptive field while still maintaining equivariance to global isometries. We prove theoretical results regarding the equivariance and construction of higher order gauge equivariant convolutions. Then, we empirically demonstrate the parameter efficiency of our model, first on computer vision benchmark data (e.g. spherical MNIST), and then in combination with a convolutional kernel network (CKN) on neuroimaging data. In the neuroimaging experiments, the resulting two-part architecture (CKN + GEVNet) is used to automatically discriminate between patients with Lewy Body Disease (DLB), Alzheimer's Disease (AD) and Parkinson's Disease (PD) from diffusion magnetic resonance images (dMRI). The GEVNet extracts micro-architectural features within each voxel, while the CKN extracts macro-architectural features across voxels. This compound architecture is uniquely poised to exploit the intra- and inter-voxel information contained in the dMRI data, leading to improved performance over the classification results obtained from either of the individual components.
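
To make the phrase "higher order generalization" of a convolution concrete, the following is a minimal sketch of a classical second order Volterra filter on a flat 1D signal. It is not the gauge equivariant construction of the paper: it ignores the manifold, the gauge, and all equivariance constraints, and only illustrates how a quadratic term couples pairs of samples inside one receptive field. The function name `volterra2_1d` and the parameters `w1` (linear kernel) and `w2` (quadratic kernel) are hypothetical names chosen for this illustration.

```python
def volterra2_1d(x, w1, w2):
    """Second order Volterra filter on a 1D signal (no padding).

    Illustrative assumption, not the paper's GEVNet layer:
        y[n] = sum_i w1[i] * x[n+i]
             + sum_{i,j} w2[i][j] * x[n+i] * x[n+j]
    The first sum is an ordinary (cross-correlation style) convolution;
    the second sum models pairwise interactions within the window.
    """
    k = len(w1)
    if len(w2) != k or any(len(row) != k for row in w2):
        raise ValueError("w2 must be a k x k kernel matching len(w1)")

    out = []
    for n in range(len(x) - k + 1):
        patch = x[n:n + k]  # receptive field starting at position n
        linear = sum(w1[i] * patch[i] for i in range(k))
        quadratic = sum(
            w2[i][j] * patch[i] * patch[j]
            for i in range(k)
            for j in range(k)
        )
        out.append(linear + quadratic)
    return out


# With a zero quadratic kernel the filter reduces to a plain linear filter.
linear_only = volterra2_1d([1, 2, 3, 4], [1, 0], [[0, 0], [0, 0]])

# A single nonzero quadratic weight adds the product of neighbouring samples.
with_interaction = volterra2_1d([1, 2, 3, 4], [1, 0], [[0, 1], [0, 0]])
```

In this toy setting the quadratic kernel has k² weights per window, which shows why parameter efficiency becomes a concern once higher order terms are added to a convolution.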