LG · OC · Apr 9

Mathematical analysis of one-layer neural network with fixed biases, a new activation function and other observations

arXiv:2604.07715 · 5.2 · h-index: 2

Predicted impact top 96% in LG · last 90 days · Originality: Synthesis-oriented

AI Analysis

This work provides theoretical insights into neural network training dynamics; for researchers in machine learning theory, the contribution is incremental.

The authors analyze a simple one-hidden-layer neural network with ReLU activations and fixed biases, prove that gradient descent on the $L^2$ squared loss converges and exhibits the spectral bias property, and propose a new activation function, FReX.

We analyze a simple one-hidden-layer neural network with ReLU activation functions and fixed biases, with one-dimensional input and output. We study both continuous and discrete versions of the model, and we rigorously prove the convergence of the learning process with the $L^2$ squared loss function and the gradient descent procedure. We also prove the spectral bias property for this learning process. Several conclusions of this analysis are discussed; in particular, regarding the structure and properties that activation functions should possess, as well as the relationships between the spectrum of certain operators and the learning process. Based on this, we also propose an alternative activation function, the full-wave rectified exponential function (FReX), and we discuss the convergence of the gradient descent with this alternative activation function.
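As a rough illustration of the discrete setting described in the abstract, the sketch below trains a one-hidden-layer ReLU network with one-dimensional input and output by gradient descent on the squared loss. It makes several assumptions not stated in the abstract: only the output weights are trained, the fixed biases sit on a uniform grid, and the step size is the inverse of the largest eigenvalue of the feature Gram matrix. The function names `relu_features` and `train_output_weights` are hypothetical. Under these assumptions the model is linear in the trained weights, so each eigen-direction of the Gram matrix contracts by the factor $1 - \eta\lambda_k$ per step, which is one simple way to see how a spectrum can govern the speed of learning. This is not the paper's construction or proof.

```python
import numpy as np


def relu_features(x, biases):
    """Hidden-layer outputs: phi[i, j] = ReLU(x_i - b_j), with biases held fixed."""
    return np.maximum(x[:, None] - biases[None, :], 0.0)


def train_output_weights(x, y, biases, steps=2000):
    """Gradient descent on L(a) = (1 / 2n) * ||phi @ a - y||^2 over output weights a.

    Assumption for this sketch: only the output weights are trained.
    """
    phi = relu_features(x, biases)
    n = len(x)
    gram = phi.T @ phi / n  # Hessian of the loss with respect to a
    # Step size 1 / lambda_max, so the loss cannot increase between steps.
    lr = 1.0 / np.linalg.eigvalsh(gram)[-1]
    a = np.zeros(len(biases))
    losses = []
    for _ in range(steps):
        residual = phi @ a - y
        losses.append(0.5 * np.mean(residual ** 2))
        a -= lr * (phi.T @ residual) / n  # gradient of L with respect to a
    return a, np.array(losses)
```

For example, calling `train_output_weights` with `x = np.linspace(0.0, 1.0, 64)`, `y = np.sin(2 * np.pi * x)` and `biases = np.linspace(0.0, 1.0, 16, endpoint=False)` returns the learned weights together with the loss recorded at every step, which can be plotted to inspect how quickly different components of the target are fitted.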
