LG AI ML

Feb 28, 2025

1-Lipschitz Network Initialization for Certifiably Robust Classification Applications: A Decay Problem

Marius F. R. Juston, Ramavarapu S. Sreenivas, William R. Norris, Dustin Nottage, Ahmet Soylemezoglu

arXiv:2503.00240v2

2 citations

h-index: 5

Originality: Synthesis-oriented

AI Analysis

This work addresses initialization problems faced by researchers developing neural networks that are certifiably robust to adversarial attacks, but it appears incremental because it analyzes existing architectures.

The paper analyzes weight initialization in 1-Lipschitz neural networks (the AOL and SLL architectures) for certifiably robust classification. It finds that the variance of the initial weights does not affect the output variance, and that deep networks always decay to zero regardless of the initialization distribution.

This paper discusses the weight parametrization of two standard 1-Lipschitz network architectures, the Almost-Orthogonal-Layers (AOL) and the SDP-based Lipschitz Layers (SLL). It examines their impact on initialization for deep 1-Lipschitz feedforward networks, and discusses underlying issues surrounding this initialization. These networks are mainly used in certifiably robust classification applications to combat adversarial attacks by limiting the impact of perturbations on the classification output. Exact and upper bounds for the parameterized weight variance were calculated assuming a standard Normal distribution initialization; additionally, an upper bound was computed assuming a Generalized Normal Distribution, generalizing the proof for Uniform, Laplace, and Normal distribution weight initializations. It is demonstrated that the weight variance holds no bearing on the output variance distribution and that only the dimension of the weight matrices matters. Additionally, this paper demonstrates that the weight initialization always causes deep 1-Lipschitz networks to decay to zero.
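The two headline claims can be illustrated with a small numerical sketch. It is not the authors' code, and it makes several assumptions: the AOL rescaling is taken to be W = P diag(d) with d_j = (sum_i |P^T P|_ij)^(-1/2), as recalled from the original AOL construction (the abstract does not state it); the layers are square, bias-free, and followed by ReLU; and the helper names (`aol_rescale`, `forward_norms`) are hypothetical. Only the Python standard library is used.

```python
import math
import random


def aol_rescale(P):
    """Assumed AOL parametrization: W = P diag(d), d_j = (sum_i |P^T P|_ij)^(-1/2)."""
    rows, cols = len(P), len(P[0])
    # Gram matrix G = P^T P, of shape (cols x cols).
    G = [[sum(P[k][i] * P[k][j] for k in range(rows)) for j in range(cols)]
         for i in range(cols)]
    # One rescaling factor per column of P.
    d = [1.0 / math.sqrt(sum(abs(G[i][j]) for i in range(cols)))
         for j in range(cols)]
    return [[P[r][c] * d[c] for c in range(cols)] for r in range(rows)]


def gaussian_matrix(n, sigma, rng):
    """Unconstrained n x n parameter matrix with i.i.d. N(0, sigma^2) entries."""
    return [[rng.gauss(0.0, sigma) for _ in range(n)] for _ in range(n)]


def matvec(W, x):
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def norm(x):
    return math.sqrt(sum(v * v for v in x))


def forward_norms(n, depth, sigma, seed):
    """Track the activation norm through `depth` freshly initialized AOL + ReLU layers."""
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]  # random input, independent of sigma
    norms = [norm(x)]
    for _ in range(depth):
        W = aol_rescale(gaussian_matrix(n, sigma, rng))
        x = [max(0.0, v) for v in matvec(W, x)]  # ReLU, itself 1-Lipschitz
        norms.append(norm(x))
    return norms


# Same seed, very different initialization scales.
norms_small = forward_norms(n=16, depth=20, sigma=0.1, seed=0)
norms_large = forward_norms(n=16, depth=20, sigma=10.0, seed=0)

print("input norm:            ", norms_small[0])
print("output norm, sigma=0.1:", norms_small[-1])
print("output norm, sigma=10: ", norms_large[-1])
```

Under these assumptions, scaling P by a constant c scales P^T P by c^2 and d by 1/c, so W is unchanged. This is one way to see why the initialization variance drops out and only the matrix dimension remains. The shrinking norm across layers is the decay the abstract refers to, seen here for a single random input rather than proven in general.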
