DIS-NN AI LG Sep 14, 2023

The kernel-balanced equation for deep neural networks

arXiv:2309.07367v1

Originality: Synthesis-oriented

AI Analysis

This paper addresses instability in deep-learning-based distribution estimation, but the contribution appears incremental because it builds on existing phenomenological descriptions.

The authors study the instability of distribution estimation with deep neural networks and show that it depends on data density and training duration. They derive a kernel-balanced equation to explain the mechanism: the network's prediction is a local average of the dataset, and the scale of that averaging decreases over training, which eventually leads to instability.

Deep neural networks have found many fruitful applications over the past decade. A network can learn a generalized function by training on a finite dataset. The degree of generalization reflects a proximity scale in the data space. In particular, this scale is unclear when the dataset is complicated. Here we consider a network that estimates the distribution of the dataset. We show that the estimation is unstable and that the instability depends on the data density and the training duration. We derive the kernel-balanced equation, which gives a short phenomenological description of the solution. The equation explains the reason for the instability and the mechanism that sets the scale. The network outputs a local average of the dataset as its prediction, and the scale of averaging is determined by the equation. The scale gradually decreases during training and, in our case, finally results in instability.
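
The local-averaging picture in the abstract can be illustrated with a toy example. The sketch below is not the paper's kernel-balanced equation and involves no neural network. It assumes a Gaussian kernel, a synthetic 1-D dataset with random binary labels, and a hand-picked list of averaging scales; the names `local_average` and `roughness` are hypothetical. It only shows that a kernel-weighted local average becomes rougher as the averaging scale shrinks relative to the spacing of the data.

```python
import numpy as np

def local_average(x_query, x_data, y_data, scale):
    """Gaussian-kernel weighted average of y_data around each query point.

    Assumption: a Gaussian kernel of width `scale`; the paper's kernel may differ.
    """
    diff = x_query[:, None] - x_data[None, :]
    logw = -0.5 * (diff / scale) ** 2
    # Shift log-weights per row so the largest weight is 1 (avoids 0/0 underflow).
    logw -= logw.max(axis=1, keepdims=True)
    w = np.exp(logw)
    return (w @ y_data) / w.sum(axis=1)

# Synthetic data (illustrative only): 50 points in [0, 1] with random 0/1 labels.
rng = np.random.default_rng(0)
x_data = rng.uniform(0.0, 1.0, size=50)
y_data = rng.integers(0, 2, size=50).astype(float)
x_query = np.linspace(0.0, 1.0, 200)

# Total variation of the prediction as a crude measure of roughness.
roughness = {}
for scale in (0.3, 0.1, 0.03, 0.01):
    pred = local_average(x_query, x_data, y_data, scale)
    roughness[scale] = float(np.abs(np.diff(pred)).sum())
    print(f"scale={scale:<5} total variation={roughness[scale]:.2f}")
```

At a large scale the prediction is close to the global mean of the labels. At a small scale it follows individual data points, so its behavior depends strongly on how densely the data cover the region, which is in the spirit of the density dependence described above.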
