ML · LG · ST · Jun 16, 2025

Understanding Learning Invariance in Deep Linear Networks

arXiv:2506.13714v1 · 1 citation · h-index: 3
Originality: Incremental advance
AI Analysis

This paper gives machine learning practitioners theoretical insight into invariance methods, though the advance is incremental because it builds on existing empirical studies.

The paper theoretically compares three approaches for achieving invariance in deep linear networks: data augmentation, regularization, and hard-wiring, showing that hard-wiring and data augmentation share identical critical points, while regularization introduces extra saddles but converges to the hard-wired solution.

Equivariant and invariant machine learning models exploit symmetries and structural patterns in data to improve sample efficiency. While empirical studies suggest that data-driven methods such as regularization and data augmentation can perform comparably to explicitly invariant models, theoretical insights remain scarce. In this paper, we provide a theoretical comparison of three approaches for achieving invariance: data augmentation, regularization, and hard-wiring. We focus on mean squared error regression with deep linear networks, which parametrize rank-bounded linear maps and can be hard-wired to be invariant to specific group actions. We show that the critical points of the optimization problems for hard-wiring and data augmentation are identical, consisting solely of saddles and the global optimum. By contrast, regularization introduces additional critical points, though they remain saddles except for the global optimum. Moreover, we demonstrate that the regularization path is continuous and converges to the hard-wired solution.
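The comparison can be illustrated numerically in the simplest setting. The sketch below is not the paper's construction: it assumes a single linear layer with no rank bound, a cyclic-shift group acting on the inputs by permutation matrices, and a regularizer of the form λ‖W(I − P)‖², where P is the group-averaging projector. All names (`group`, `P`, `w_reg`, and so on) and the random data are hypothetical. Under these assumptions, the augmented least-squares solution coincides with the hard-wired one, and the regularized solution approaches it as λ grows.

```python
import numpy as np

rng = np.random.default_rng(0)
d, m, n = 4, 2, 20                      # input dim, output dim, sample count
X = rng.normal(size=(d, n))             # inputs, one sample per column
Y = rng.normal(size=(m, n))             # regression targets

# Assumed group: cyclic shifts of the d input coordinates.
group = [np.roll(np.eye(d), k, axis=0) for k in range(d)]
P = sum(group) / len(group)             # projector onto shift-invariant vectors

# Data augmentation: ordinary least squares on every transformed copy of X.
X_aug = np.hstack([g @ X for g in group])
Y_aug = np.hstack([Y] * len(group))
W_aug = np.linalg.lstsq(X_aug.T, Y_aug.T, rcond=None)[0].T

# Hard-wiring: regress on invariant features Z = B^T X, so W = V B^T
# satisfies W g = W for every group element by construction.
evals, evecs = np.linalg.eigh(P)
B = evecs[:, evals > 0.5]               # orthonormal basis of the invariant subspace
Z = B.T @ X
V = np.linalg.lstsq(Z.T, Y.T, rcond=None)[0].T
W_hw = V @ B.T

def w_reg(lam):
    """Minimizer of ||W X - Y||^2 + lam * ||W (I - P)||^2 (assumed penalty)."""
    A = X @ X.T + lam * (np.eye(d) - P)  # symmetric positive definite
    return np.linalg.solve(A, X @ Y.T).T

print("augmentation vs hard-wired gap:", np.linalg.norm(W_aug - W_hw))
for lam in (1.0, 1e2, 1e4, 1e6):
    print(f"lambda={lam:g}  gap to hard-wired:", np.linalg.norm(w_reg(lam) - W_hw))
```

The sketch only compares global minimizers in the unconstrained case. The paper's statements about saddles and rank-bounded deep linear networks are not reproduced here.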

