Learning from Frustration: Torsor CNNs on Graphs
This work addresses a limitation of equivariant neural networks in domains with local symmetries, such as multi-view 3D recognition, by providing a foundational framework that generalizes existing architectures. The contribution appears incremental, however, in that it extends concepts such as Gauge CNNs to arbitrary graphs.
The paper tackles the problem of learning on graphs with local symmetries, which existing equivariant neural networks do not handle because they rely on a global symmetry. It introduces Torsor CNNs, a framework that is provably equivariant to local changes of coordinate frame and that includes a frustration loss for regularization, and it demonstrates applicability to multi-view 3D recognition.
Most equivariant neural networks rely on a single global symmetry, limiting their use in domains where symmetries are instead local. We introduce Torsor CNNs, a framework for learning on graphs with local symmetries encoded as edge potentials -- group-valued transformations between neighboring coordinate frames. We establish that this geometric construction is fundamentally equivalent to the classical group synchronization problem, yielding: (1) a Torsor Convolutional Layer that is provably equivariant to local changes in coordinate frames, and (2) the frustration loss -- a standalone geometric regularizer that encourages locally equivariant representations when added to any NN's training objective. The Torsor CNN framework unifies and generalizes several architectures -- including classical CNNs and Gauge CNNs on manifolds -- by operating on arbitrary graphs without requiring a global coordinate system or smooth manifold structure. We establish the mathematical foundations of this framework and demonstrate its applicability to multi-view 3D recognition, where relative camera poses naturally define the required edge potentials.
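To make the idea of edge potentials and a frustration-style regularizer concrete, the following is a minimal sketch and not the paper's implementation. It assumes the group is SO(2) acting on 2D node features by rotation matrices, and it assumes the loss is the mean squared residual between a node's feature and its neighbor's feature transported along the edge, in the spirit of the group synchronization objective. The names `rotation`, `frustration_loss`, `edges`, and `potentials` are hypothetical and chosen only for illustration.

```python
import numpy as np


def rotation(theta):
    """Return the 2x2 rotation matrix for angle theta (an element of SO(2))."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])


def frustration_loss(features, edges, potentials):
    """Mean squared mismatch between f_i and g_ij applied to f_j.

    Assumed form (not taken from the paper):
        (1 / |E|) * sum over edges (i, j) of ||f_i - g_ij f_j||^2
    features:   array of shape (num_nodes, 2)
    edges:      list of (i, j) index pairs
    potentials: list of 2x2 matrices g_ij, one per edge
    """
    total = 0.0
    for (i, j), g_ij in zip(edges, potentials):
        residual = features[i] - g_ij @ features[j]
        total += float(residual @ residual)
    return total / len(edges)


# Toy graph: a triangle whose edge potentials come from per-node frames,
# g_ij = g_i g_j^{-1}, so the potentials are consistent around the cycle.
angles = np.array([0.3, 1.1, -0.7])
frames = [rotation(a) for a in angles]
edges = [(0, 1), (1, 2), (2, 0)]
potentials = [frames[i] @ frames[j].T for i, j in edges]

# Features that are one shared vector expressed in each node's frame
# agree exactly with the potentials.
base = np.array([1.0, 2.0])
consistent = np.stack([g @ base for g in frames])

# Perturbed features no longer agree with the potentials.
rng = np.random.default_rng(0)
noisy = consistent + 0.1 * rng.standard_normal(consistent.shape)

loss_consistent = frustration_loss(consistent, edges, potentials)
loss_noisy = frustration_loss(noisy, edges, potentials)
```

In this toy setup the consistent features give a loss that is zero up to floating-point error, while the perturbed features give a strictly positive loss. A term of this kind could be added to a training objective as a regularizer. In the multi-view setting described above, the potentials would come from relative camera poses rather than from synthetic frames.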