Representation Learning on Unit Ball with 3D Roto-Translational Equivariance
This work addresses 3D data representation for researchers in computer vision and machine learning. It offers a novel method for handling volumetric data with 3D roto-translational equivariance, though the contribution is incremental in that it extends convolution to one specific topological space.
The paper tackles the challenge of extending convolution to non-Euclidean spaces such as the unit ball (𝔹³). It proposes a novel volumetric convolution operation based on Zernike polynomials and demonstrates its efficacy on 3D object recognition, reporting improved performance.
Convolution is an integral operation that defines how the shape of one function is modified by another function. This powerful concept forms the basis of hierarchical feature learning in deep neural networks. Although performing convolution in Euclidean geometries is fairly straightforward, its extension to other topological spaces, such as a sphere ($\mathbb{S}^2$) or a unit ball ($\mathbb{B}^3$), entails unique challenges. In this work, we propose a novel `\emph{volumetric convolution}' operation that can effectively model and convolve arbitrary functions in $\mathbb{B}^3$. We develop a theoretical framework for \emph{volumetric convolution} based on Zernike polynomials and efficiently implement it as a differentiable and easily pluggable layer in deep networks. By construction, our formulation leads to a novel formula for measuring the symmetry of a function in $\mathbb{B}^3$ around an arbitrary axis, which is useful in function analysis tasks. We demonstrate the efficacy of the proposed volumetric convolution operation on one viable use case, i.e., 3D object recognition.
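
For orientation, the two ingredients named above can be written out in their conventional textbook forms. This is a sketch under stated assumptions, not the formulation developed in this work: the symbols $\Omega_{n\ell m}$, $Z_{n\ell m}$, $R_{n\ell}$, and $Y_{\ell m}$ are introduced here for illustration only, and the Zernike basis is assumed to be orthonormal on $\mathbb{B}^3$. The familiar Euclidean convolution of two functions $f$ and $g$ on $\mathbb{R}^3$ is
\[
(f * g)(x) = \int_{\mathbb{R}^3} f(y)\, g(x - y)\, dy ,
\]
which relies on the translation $x - y$ being well defined everywhere. On the unit ball, a function is instead commonly represented by its expansion in 3D Zernike polynomials,
\[
f(r, \theta, \phi) = \sum_{n=0}^{\infty} \sum_{\ell=0}^{n} \sum_{m=-\ell}^{\ell} \Omega_{n\ell m}\, Z_{n\ell m}(r, \theta, \phi),
\qquad
Z_{n\ell m}(r, \theta, \phi) = R_{n\ell}(r)\, Y_{\ell m}(\theta, \phi),
\]
where $n - \ell$ is even, $R_{n\ell}$ is a radial polynomial, and $Y_{\ell m}$ is a spherical harmonic. Under the orthonormality assumption, the coefficients are obtained by projection:
\[
\Omega_{n\ell m} = \int_{0}^{1} \int_{0}^{\pi} \int_{0}^{2\pi} f(r, \theta, \phi)\, \overline{Z_{n\ell m}(r, \theta, \phi)}\, r^{2} \sin\theta \; d\phi\, d\theta\, dr .
\]
Working with the coefficients $\Omega_{n\ell m}$ rather than with point samples is what typically makes a rotation-aware operation on $\mathbb{B}^3$ tractable, since a rotation acts only on the angular factor $Y_{\ell m}$ and leaves $R_{n\ell}$ unchanged.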