LG NA Feb 23

Quantitative Approximation Rates for Group Equivariant Learning

Jonathan W. Siegel, Snir Hordan, Hannah Lawrence, Ali Syed, Nadav Dym

arXiv:2602.20370v1

2.71 citations

h-index: 1

Originality: Incremental advance

AI Analysis

This work provides theoretical guarantees for practitioners who use equivariant models in domains such as physics or chemistry. The advance is incremental, since it extends existing approximation theory to symmetric settings.

The paper quantifies approximation rates for group-equivariant neural networks. It shows that equivariant architectures such as Deep Sets and Transformers are as expressive as equally-sized ReLU MLPs on symmetric functions, so hard-coding the symmetry costs no approximation power.

The universal approximation theorem establishes that neural networks can approximate any continuous function on a compact set. Later works in approximation theory provide quantitative approximation rates for ReLU networks on the class of $α$-Hölder functions $f: [0,1]^N \to \mathbb{R}$. The goal of this paper is to provide similar quantitative approximation results in the context of group equivariant learning, where the learned $α$-Hölder function is known to obey certain group symmetries. While there has been much interest in the literature in understanding the universal approximation properties of equivariant models, very few quantitative approximation results are known for equivariant models. In this paper, we bridge this gap by deriving quantitative approximation rates for several prominent group-equivariant and invariant architectures. The architectures that we consider include: the permutation-invariant Deep Sets architecture; the permutation-equivariant Sumformer and Transformer architectures; joint invariance to permutations and rigid motions using invariant networks based on frame averaging; and general bi-Lipschitz invariant models. Overall, we show that equally-sized ReLU MLPs and equivariant architectures are equally expressive over equivariant functions. Thus, hard-coding equivariance does not result in a loss of expressivity or approximation power in these models.
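To make the permutation-invariant setting concrete, here is a minimal sketch of a Deep Sets style model of the form $f(X) = \rho\left(\sum_i \phi(x_i)\right)$, written in plain Python. It illustrates only the architectural idea and is not the construction used in the paper's proofs. The layer sizes, the random weights, and the helper names (`make_layer`, `apply_layer`, `deep_sets`) are assumptions chosen for illustration.

```python
import random


def relu(x):
    return x if x > 0.0 else 0.0


def make_layer(n_in, n_out, rng):
    """Hypothetical helper: a random dense layer stored as (weights, biases)."""
    weights = [[rng.uniform(-1.0, 1.0) for _ in range(n_in)] for _ in range(n_out)]
    biases = [rng.uniform(-1.0, 1.0) for _ in range(n_out)]
    return weights, biases


def apply_layer(layer, x, activate=True):
    """Affine map followed by an optional ReLU."""
    weights, biases = layer
    out = [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, biases)]
    return [relu(v) for v in out] if activate else out


def deep_sets(points, phi, rho):
    """f(X) = rho(sum_i phi(x_i)); sum pooling removes the dependence on order."""
    features = [apply_layer(phi, p) for p in points]
    pooled = [sum(col) for col in zip(*features)]
    return apply_layer(rho, pooled, activate=False)[0]


rng = random.Random(0)
phi = make_layer(3, 8, rng)  # per-element encoder (sizes are illustrative)
rho = make_layer(8, 1, rng)  # readout applied to the pooled features

# A set of 5 points in [0, 1]^3 and a reordering of the same set.
points = [[rng.uniform(0.0, 1.0) for _ in range(3)] for _ in range(5)]
shuffled = points[:]
rng.shuffle(shuffled)

out_original = deep_sets(points, phi, rho)
out_shuffled = deep_sets(shuffled, phi, rho)
# The two outputs agree up to floating-point rounding in the summation.
```

Because the only interaction between set elements is the sum, reordering the input changes the result only through floating-point rounding. This is the hard-coded invariance whose effect on approximation power the abstract addresses.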
