LG
Jan 28, 2022

Limitation of Characterizing Implicit Regularization by Data-independent Functions

arXiv:2201.12198v2
1 citation
Originality: Incremental advance
AI Analysis

This work addresses a central theoretical challenge in deep learning by highlighting the data dependency of implicit regularization, though it is incremental as it builds on prior studies.

The paper tackles the problem of mathematically defining and studying implicit regularization in neural networks, focusing on the limitations of characterizing it with data-independent functions. It proposes two dynamical mechanisms that yield recipes for producing classes of one-hidden-neuron networks which provably cannot be fully characterized by one type of data-independent function, or by any data-independent function at all.

In recent years, understanding the implicit regularization of neural networks (NNs) has become a central task in deep learning theory. However, implicit regularization itself is neither completely defined nor well understood. In this work, we attempt to mathematically define and study implicit regularization. Importantly, we explore the limitations of a common approach that characterizes implicit regularization by data-independent functions. We propose two dynamical mechanisms, the Two-point and One-point Overlapping mechanisms, and based on them we provide two recipes for producing classes of one-hidden-neuron NNs that provably cannot be fully characterized by a certain type of data-independent function, or by any data-independent function at all. In line with previous work, our results further emphasize the profound data dependency of implicit regularization in general, and motivate a detailed future study of the data dependency of NN implicit regularization.
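
As a rough illustration of what "characterized by a data-independent function" usually means, the following is a sketch under assumed notation, not the paper's own definition. Here S denotes the training set, L_S the training loss, and \theta^{\infty}(S) the parameters that training converges to; all of these symbols, and the function G, are introduced only for this illustration.

```latex
% Assumed notation (not from the paper): S = training data, L_S = training loss,
% \theta^{\infty}(S) = limit of the training dynamics, G = candidate regularizer.
% Requires amsmath for \operatorname* and \text.
\theta^{\infty}(S) \in \operatorname*{arg\,min}_{\theta \,:\, L_S(\theta) = 0} G(\theta),
\qquad \text{where } G \text{ does not depend on } S .
```

A standard instance is underdetermined linear regression with squared loss, trained by gradient descent from zero initialization, where G(\theta) = \|\theta\|_2^2 plays this role. The abstract's claim is that for certain classes of one-hidden-neuron NNs, no G of a given type, or no G at all, can do so.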
