CV · Nov 6, 2024

Harmformer: Harmonic Networks Meet Transformers for Continuous Roto-Translation Equivariance

arXiv:2411.03794v1 · 1 citation · h-index: 25
Originality: Incremental advance
AI Analysis

This work addresses the need for robust, efficient vision networks by enabling continuous rotation equivariance in transformers. It is rated an incremental advance because it builds on existing harmonic-network and transformer methods.

The paper tackles continuous rotation equivariance in transformers. It introduces Harmformer, which outperforms previous equivariant transformers and remains stable under any continuous rotation, even when trained without rotated samples.

CNNs exhibit inherent equivariance to image translation, leading to efficient parameter and data usage, faster learning, and improved robustness. The concept of translation-equivariant networks has been successfully extended to rotations, using group convolution for discrete rotation groups and harmonic functions for the continuous rotation group encompassing $360^\circ$. We explore the compatibility of the self-attention (SA) mechanism with full rotation equivariance, in contrast to previous studies that focused on discrete rotation. We introduce the Harmformer, a harmonic transformer with a convolutional stem that achieves equivariance for both translation and continuous rotation. Accompanied by an end-to-end equivariance proof, the Harmformer not only outperforms previous equivariant transformers, but also demonstrates inherent stability under any continuous rotation, even without seeing rotated samples during training.
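
The role of harmonic functions in continuous rotation equivariance can be illustrated with a toy example. This is a minimal sketch of the general circular-harmonic property, not the Harmformer architecture: the angular profile `signal`, the helper names, and the sample count are all hypothetical choices made for illustration. Projecting an angular profile onto $e^{-im\varphi}$ gives a response whose phase shifts by $-m\theta$ when the input is rotated by an arbitrary angle $\theta$, while its magnitude is unchanged.

```python
import cmath
import math


def signal(phi):
    # Hypothetical band-limited angular profile (illustrative only).
    return 1.0 + 2.0 * math.cos(phi - 0.3) + 0.5 * math.cos(3.0 * phi + 1.0)


def harmonic_response(f, m, n_samples=16):
    # Project f onto the circular harmonic exp(-i*m*phi) using uniform
    # angular samples. This is exact here because f is band-limited and
    # n_samples exceeds the highest frequency involved.
    total = 0j
    for k in range(n_samples):
        phi = 2.0 * math.pi * k / n_samples
        total += f(phi) * cmath.exp(-1j * m * phi)
    return total / n_samples


def rotate(f, theta):
    # Rotating the input by theta shifts its angular argument.
    return lambda phi: f(phi - theta)


theta = 0.7  # an arbitrary angle, not tied to the sampling grid
for m in (1, 3):
    base = harmonic_response(signal, m)
    rotated = harmonic_response(rotate(signal, theta), m)
    predicted = cmath.exp(-1j * m * theta) * base
    # Equivariance: rotating the input only multiplies the response by a phase.
    assert abs(rotated - predicted) < 1e-9
    # The magnitude is therefore rotation invariant.
    assert abs(abs(rotated) - abs(base)) < 1e-9
```

Because the rotation appears only as a predictable phase factor, the same relation holds for every angle rather than for a fixed discrete set, which is the property that separates harmonic approaches from group convolution over discrete rotations.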

