ML · LG · Jun 4, 2021

Provably Strict Generalisation Benefit for Invariance in Kernel Methods

arXiv:2106.02346v2 · 33 citations
Originality: Highly original
AI Analysis

This work gives a rigorous theoretical foundation for the long-standing belief that enforcing invariance improves generalisation, with implications for kernel methods and generalisation theory.

The authors prove that enforcing invariance improves generalisation in kernel ridge regression: when the target is invariant to the action of a compact group, the generalisation benefit is strictly non-zero.

It is a commonly held belief that enforcing invariance improves generalisation. Although this approach enjoys widespread popularity, it is only very recently that a rigorous theoretical demonstration of this benefit has been established. In this work we build on the function space perspective of Elesedy and Zaidi (arXiv:2102.10333) to derive a strictly non-zero generalisation benefit of incorporating invariance in kernel ridge regression when the target is invariant to the action of a compact group. We study invariance enforced by feature averaging and find that generalisation is governed by a notion of effective dimension that arises from the interplay between the kernel and the group. In building towards this result, we find that the action of the group induces an orthogonal decomposition of both the reproducing kernel Hilbert space and its kernel, which may be of interest in its own right.
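As a rough illustration of feature averaging in kernel ridge regression, the sketch below fits an ordinary kernel ridge regressor and then averages its predictions over the orbit of each input under a group. Everything concrete here is an assumption made for illustration and is not taken from the paper: the group is the finite cyclic group of coordinate shifts (a simple special case of a compact group), the kernel is an RBF kernel, the target is `sin` of the coordinate sum, and the function names (`rbf_kernel`, `cyclic_orbit`, `fit_krr`, `predict`, `predict_averaged`) are hypothetical.

```python
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    """RBF kernel matrix between rows of A (n, d) and rows of B (m, d)."""
    sq = (A ** 2).sum(1)[:, None] + (B ** 2).sum(1)[None, :] - 2 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

def cyclic_orbit(X):
    """Apply every cyclic coordinate shift to each row; result has shape (d, n, d)."""
    d = X.shape[1]
    return np.stack([np.roll(X, s, axis=1) for s in range(d)])

def fit_krr(X, y, lam=1e-3, gamma=1.0):
    """Standard kernel ridge regression: solve (K + n*lam*I) alpha = y."""
    n = len(X)
    K = rbf_kernel(X, X, gamma)
    return np.linalg.solve(K + lam * n * np.eye(n), y)

def predict(alpha, X_train, X, gamma=1.0):
    """Evaluate the fitted (non-invariant) predictor at the rows of X."""
    return rbf_kernel(X, X_train, gamma) @ alpha

def predict_averaged(alpha, X_train, X, gamma=1.0):
    """Feature-averaged predictor: mean of f(g x) over all group elements g."""
    return np.mean(
        [predict(alpha, X_train, gX, gamma) for gX in cyclic_orbit(X)], axis=0
    )

# Toy data (assumed): the target depends only on the coordinate sum,
# so it is invariant under cyclic shifts of the coordinates.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(40, 3))
y_train = np.sin(X_train.sum(axis=1))
X_test = rng.normal(size=(200, 3))
y_test = np.sin(X_test.sum(axis=1))

alpha = fit_krr(X_train, y_train)
mse_plain = np.mean((predict(alpha, X_train, X_test) - y_test) ** 2)
mse_avg = np.mean((predict_averaged(alpha, X_train, X_test) - y_test) ** 2)
print(f"test MSE, plain KRR:        {mse_plain:.4f}")
print(f"test MSE, feature-averaged: {mse_avg:.4f}")
```

The averaged predictor is invariant by construction, since shifting an input only reorders the terms of the orbit average. The printed errors are a single empirical comparison on toy data and should not be read as a check of the paper's bound.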

