LG ML | May 9, 2021

Directional Convergence Analysis under Spherically Symmetric Distribution

arXiv:2105.03879v11.6

Originality: Incremental advance

AI Analysis

This work addresses theoretical convergence issues in neural network training for separable datasets, though it is incremental as it builds on prior assumptions and focuses on specific network architectures.

The paper tackles the problem of learning linear predictors with neural networks under spherically symmetric data distributions, showing directional convergence guarantees with exact convergence rates for two-layer non-linear networks with two hidden nodes and deep linear networks.

We consider the fundamental problem of learning linear predictors (i.e., separable datasets with zero margin) using neural networks trained with gradient flow or gradient descent. Under the assumption of a spherically symmetric data distribution, we show directional convergence guarantees with exact convergence rates for two-layer non-linear networks with only two hidden nodes, and for (deep) linear networks. Moreover, in contrast to previous works, our results are built on the dynamics from initialization, with no constraint on the initial loss and no requirement of perfect classification. We also point out and study the challenges in further strengthening and generalizing our results.
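To make "directional convergence" concrete, the following is a minimal numerical sketch, not the paper's construction or proof. It assumes an isotropic Gaussian as one instance of a spherically symmetric distribution, a finite sample instead of the population, the logistic loss, a two-layer linear network, and a fixed step size; the names `run`, `cosine`, `w_star`, `W1`, and `w2` are hypothetical and chosen only for illustration. It tracks the cosine between the network's effective linear predictor and the ground-truth direction, starting from an initialization orthogonal to that direction.

```python
import numpy as np

def cosine(u, v):
    """Cosine of the angle between two nonzero vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def run(d=5, n=2000, steps=500, lr=0.2, seed=0):
    rng = np.random.default_rng(seed)
    # Assumption: isotropic Gaussian inputs (spherically symmetric).
    X = rng.standard_normal((n, d))
    # Ground-truth direction; labels are its sign pattern (zero margin
    # at the population level).
    w_star = np.zeros(d)
    w_star[0] = 1.0
    y = np.sign(X @ w_star)

    # Two-layer linear network f(x) = w2^T W1 x, with effective
    # predictor v = W1^T w2. Deterministic init orthogonal to w_star.
    W1 = 0.5 * np.eye(d)
    w2 = np.zeros(d)
    w2[1] = 1.0

    history = []
    for _ in range(steps + 1):
        v = W1.T @ w2
        history.append(cosine(v, w_star))
        margins = y * (X @ v)
        # sigmoid(-margin), computed stably via logaddexp.
        coef = np.exp(-np.logaddexp(0.0, margins))
        # Gradient of the mean logistic loss with respect to v.
        g = -(X * (y * coef)[:, None]).mean(axis=0)
        # Chain rule: dL/dW1 = outer(w2, g), dL/dw2 = W1 @ g.
        # Both updates use the pre-step values of W1 and w2.
        W1, w2 = W1 - lr * np.outer(w2, g), w2 - lr * (W1 @ g)
    return history

history = run()
print(f"cosine at init: {history[0]:.3f}, after training: {history[-1]:.3f}")
```

Only the direction of the effective predictor is compared with `w_star`, since on separable data the norm of the predictor keeps growing under the logistic loss. The deterministic initialization is a convenience for reproducibility and is not taken from the paper.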
