ML · LG · Nov 19, 2017

Convergence Analysis of the Dynamics of a Special Kind of Two-Layered Neural Networks with $\ell_1$ and $\ell_2$ Regularization

arXiv:1711.07005v1
Originality: Synthesis-oriented
AI Analysis

This work provides theoretical convergence guarantees for regularized neural networks, an incremental contribution of interest to researchers in optimization and machine learning.

The paper extends the convergence analysis of two-layered neural networks with a ReLU output by adding ℓ₁ and ℓ₂ regularization to the loss function. It proves that, for small regularization, the weight vector converges to the optimal solution with a specified probability under random initialization. Numerical experiments support the result.

In this paper, we extend the convergence analysis of the dynamics of two-layered bias-free networks with one ReLU output. We consider two popular regularization terms, the $\ell_1$ and $\ell_2$ norms of the parameter vector $w$, and add each to the square loss function with coefficient $\lambda/2$. We prove that when $\lambda$ is small, the weight vector $w$ converges to the optimal solution $\hat{w}$ (with respect to the new loss function) with probability $\geq (1-\varepsilon)(1-A_d)/2$ under random initialization in a sphere centered at the origin, where $\varepsilon$ is a small value and $A_d$ is a constant. Numerical experiments, including phase diagrams and repeated simulations, verify our theory.
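The setup described in the abstract can be sketched numerically. The following is a minimal illustration, not the authors' code: it assumes a teacher-student setting with standard Gaussian inputs and a hypothetical teacher vector `w_star`, uses plain gradient descent with an arbitrarily chosen step size, and reads the $\ell_2$ penalty as the squared norm $\frac{\lambda}{2}\|w\|_2^2$. The names `regularized_loss`, `regularized_grad`, and `run_gradient_descent` are illustrative and do not come from the paper.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def regularized_loss(w, X, y, lam, penalty="l2"):
    """Empirical square loss of one bias-free ReLU unit plus (lam/2) * penalty."""
    residual = relu(X @ w) - y
    data_term = 0.5 * np.mean(residual ** 2)
    # Assumption: l2 means the squared 2-norm, l1 means the 1-norm.
    reg = np.sum(w ** 2) if penalty == "l2" else np.sum(np.abs(w))
    return data_term + 0.5 * lam * reg

def regularized_grad(w, X, y, lam, penalty="l2"):
    """(Sub)gradient of regularized_loss with respect to w."""
    z = X @ w
    residual = relu(z) - y
    active = (z > 0).astype(float)            # ReLU derivative (0 at the kink)
    grad_data = X.T @ (residual * active) / X.shape[0]
    if penalty == "l2":
        grad_reg = lam * w                     # d/dw of (lam/2) * ||w||_2^2
    else:
        grad_reg = 0.5 * lam * np.sign(w)      # subgradient of (lam/2) * ||w||_1
    return grad_data + grad_reg

def run_gradient_descent(w0, X, y, lam, penalty="l2", lr=0.1, steps=500):
    """Plain gradient descent from w0; lr and steps are illustrative choices."""
    w = w0.copy()
    for _ in range(steps):
        w = w - lr * regularized_grad(w, X, y, lam, penalty)
    return w

# Hypothetical experiment: teacher w_star, Gaussian inputs, small lambda.
rng = np.random.default_rng(0)
d, n, lam = 5, 2000, 0.01
w_star = rng.normal(size=d)
w_star /= np.linalg.norm(w_star)
X = rng.normal(size=(n, d))
y = relu(X @ w_star)

# Random initialization on a sphere of radius 0.5 centered at the origin.
w0 = rng.normal(size=d)
w0 *= 0.5 / np.linalg.norm(w0)

w_hat = run_gradient_descent(w0, X, y, lam, penalty="l2")
print("initial loss:", regularized_loss(w0, X, y, lam))
print("final loss:  ", regularized_loss(w_hat, X, y, lam))
print("distance to teacher:", np.linalg.norm(w_hat - w_star))
```

Because the convergence guarantee is probabilistic, a single run from one random initialization may or may not end near the optimum; repeating the run over many draws of `w0` would give an empirical success rate to compare against the stated bound. Passing `penalty="l1"` switches to the $\ell_1$ term.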

Code Implementations: 1 repo