Tunable Soft Equivariance with Guarantees
This work addresses the need for flexible control over equivariance in computer vision models. It offers a general solution with theoretical guarantees, though the contribution is incremental in that it builds on existing pre-trained architectures.
The paper tackles the problem that strict equivariance can limit performance on real-world computer vision data. It proposes a framework for tunable soft equivariance that improves performance and reduces equivariance error on benchmarks such as ImageNet.
Equivariance is a fundamental property in computer vision models, yet strict equivariance is rarely satisfied in real-world data, which can limit a model's performance. Controlling the degree of equivariance is therefore desirable. We propose a general framework for constructing soft equivariant models by projecting the model weights into a designed subspace. The method applies to any pre-trained architecture and provides theoretical bounds on the induced equivariance error. Empirically, we demonstrate the effectiveness of our method on multiple pre-trained backbones, including ViT and ResNet, across image classification, semantic segmentation, and human-trajectory prediction tasks. Notably, our approach improves performance while simultaneously reducing equivariance error on the competitive ImageNet benchmark.
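To make the idea of projecting weights toward an equivariant subspace concrete, the following is a minimal sketch, not the paper's construction. The abstract does not specify the designed subspace, so the sketch assumes the simplest possible setting: a single square linear layer, the cyclic-shift group acting on its inputs and outputs, and projection by group averaging. The function names and the blending parameter `lam` are hypothetical and chosen only for illustration. In this toy setting the equivariance error scales exactly as `(1 - lam)` times the error of the original weights, which illustrates the kind of bound the abstract refers to without reproducing it.

```python
import math
import random


def conjugate(W, k):
    """Return S^{-k} W S^k for the cyclic shift S: entry (i, j) becomes W[i+k][j+k]."""
    n = len(W)
    return [[W[(i + k) % n][(j + k) % n] for j in range(n)] for i in range(n)]


def project_equivariant(W):
    """Group-average W over all cyclic shifts; the result is circulant,
    i.e. it commutes with every shift (strict equivariance in this toy setting)."""
    n = len(W)
    shifted = [conjugate(W, k) for k in range(n)]
    return [[sum(M[i][j] for M in shifted) / n for j in range(n)] for i in range(n)]


def soften(W, lam):
    """Hypothetical tuning knob (an assumption, not from the paper):
    lam=0 keeps W unchanged, lam=1 returns the strictly equivariant projection."""
    n = len(W)
    P = project_equivariant(W)
    return [[(1 - lam) * W[i][j] + lam * P[i][j] for j in range(n)] for i in range(n)]


def equivariance_error(W):
    """Largest Frobenius distance between W and any of its shifted conjugates."""
    n = len(W)

    def dist(A, B):
        return math.sqrt(sum((A[i][j] - B[i][j]) ** 2 for i in range(n) for j in range(n)))

    return max(dist(W, conjugate(W, k)) for k in range(n))


random.seed(0)
W = [[random.gauss(0, 1) for _ in range(6)] for _ in range(6)]  # stand-in for pre-trained weights
base = equivariance_error(W)
for lam in (0.0, 0.5, 1.0):
    # Second column: measured error; third column: the (1 - lam) * base prediction.
    print(lam, equivariance_error(soften(W, lam)), (1 - lam) * base)
```

The two printed error columns agree up to floating-point rounding because the projected matrix is invariant under conjugation by every shift, so only the `(1 - lam)` share of the original weights contributes to the error. Real backbones such as ViT or ResNet involve other groups, nonlinear layers, and structured weight tensors, so this linear relationship should not be assumed to carry over unchanged.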