Gradient-Free Learning Based on the Kernel and the Range Space
This work addresses the need for gradient-free optimization in neural networks, offering a novel framework that could benefit scenarios where gradient computation is challenging, though it appears incremental as it builds on existing linear algebra concepts.
The paper tackles the problem of gradient-free learning by showing that solving linear equations via kernel and range space manipulations is equivalent to least squares approximation, enabling adaptation to neural networks with invertible activations; the method achieves relatively good learning capability on real-world datasets of small to moderate dimensions.
In this article, we show that solving the system of linear equations by manipulating the kernel and the range space is equivalent to solving the problem of least squares error approximation. This establishes the ground for a gradient-free learning search when the system can be expressed in the form of a linear matrix equation. When the nonlinear activation function is invertible, the learning problem of a fully-connected multilayer feedforward neural network can be easily adapted for this novel learning framework. By a series of kernel and range space manipulations, it turns out that such a network learning boils down to solving a set of cross-coupling equations. By having the weights randomly initialized, the equations can be decoupled and the network solution shows relatively good learning capability for real world data sets of small to moderate dimensions. Based on the structural information of the matrix equation, the network representation is found to be dependent on the number of data samples and the output dimension.