ML · LG · Feb 20, 2021

Provably Strict Generalisation Benefit for Equivariant Models

arXiv:2102.10333v2 · 106 citations
Originality: Highly original
AI Analysis

This paper provides foundational theoretical insight for machine learning practitioners using equivariant models, though the contribution is incremental in that it covers only the linear case.

The paper addresses the lack of a precise characterisation of the generalisation benefit of invariant/equivariant models. For linear models, it proves the first strictly non-zero improvement in generalisation when the target distribution is invariant/equivariant with respect to a compact group, and it relates that improvement to the number of training examples and to properties of the group action.

It is widely believed that engineering a model to be invariant/equivariant improves generalisation. Despite the growing popularity of this approach, a precise characterisation of the generalisation benefit is lacking. By considering the simplest case of linear models, this paper provides the first provably non-zero improvement in generalisation for invariant/equivariant models when the target distribution is invariant/equivariant with respect to a compact group. Moreover, our work reveals an interesting relationship between generalisation, the number of training examples and properties of the group action. Our results rest on an observation of the structure of function spaces under averaging operators which, along with its consequences for feature averaging, may be of independent interest.
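The averaging operator mentioned in the abstract can be illustrated with a small numerical sketch. The example below is not taken from the paper: it assumes a finite cyclic group acting on R^d by coordinate shifts (the paper treats general compact groups), and the helper names `cyclic_shift_matrices` and `averaging_operator` are hypothetical. It shows that, under these assumptions, averaging the group's matrices gives an orthogonal projection, that a linear model's weights split into an invariant part and an orthogonal remainder, and that averaging the features is equivalent to using only the invariant part of the weights.

```python
import numpy as np

def cyclic_shift_matrices(d):
    """Permutation matrices for the cyclic group C_d acting on R^d by coordinate shifts."""
    eye = np.eye(d)
    return [np.roll(eye, k, axis=0) for k in range(d)]

def averaging_operator(mats):
    """Group average of the representation matrices: P = (1/|G|) * sum_g rho(g)."""
    return sum(mats) / len(mats)

d = 4
group = cyclic_shift_matrices(d)
P = averaging_operator(group)

rng = np.random.default_rng(0)
w = rng.normal(size=d)   # weights of an arbitrary linear model f(x) = w @ x
w_inv = P @ w            # invariant component of the weights
w_perp = w - w_inv       # remainder, orthogonal to the invariant component

x = rng.normal(size=d)
gx = group[1] @ x        # the same input, transformed by one group element

# The invariant part predicts identically on x and on its transformed copy.
print(np.isclose(w_inv @ x, w_inv @ gx))
# Feature averaging (replacing x by P @ x) equals using only the invariant weights.
print(np.isclose(w @ (P @ x), w_inv @ x))
```

In this toy setting the split `w = w_inv + w_perp` is orthogonal because the shift matrices are orthogonal, so `P` is a symmetric idempotent matrix. The sketch only illustrates the structure of the decomposition and does not reproduce the paper's generalisation result.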

