LG · ML · Dec 18, 2019

Tangent Space Separability in Feedforward Neural Networks

arXiv:1912.09306v1 · 5 citations
Originality: Incremental advance
AI Analysis

This paper addresses the high parameter counts and long training times that practitioners face in deep learning. The contribution appears incremental, since it builds on existing network architectures.

The paper tackles the training cost of hierarchical (deep) neural networks by proposing a sparse representation that approximates the tangent subspace, which makes it possible to switch to a shallow network (GradNet) early in training. The reported experiments show that this approximation improves on, and sometimes significantly surpasses, the performance of the original network after only a few epochs.

Hierarchical neural networks are exponentially more efficient than their "shallow" counterparts with the same expressive power, but they involve a huge number of parameters and require tedious amounts of training. By approximating the tangent subspace, we suggest a sparse representation that enables switching to shallow networks (GradNet) after a very early training stage. Our experiments show that the proposed approximation of the metric improves, and sometimes even significantly surpasses, the achievable performance of the original network, even after only a few epochs of training the original feedforward network.
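
To make the tangent-space idea concrete, here is a minimal NumPy sketch. It is not the paper's GradNet or its sparse representation. It only illustrates the general idea the abstract relies on: the gradient of a network's output with respect to its parameters, taken at a fixed checkpoint, can serve as a feature map for a shallow (linear) model. The one-hidden-layer tanh network, the helper names (init_params, forward, tangent_features, unflatten), the synthetic target, and the ridge penalty lam are all assumptions chosen for illustration.

```python
import numpy as np

def init_params(rng, d_in, d_hidden):
    # Hypothetical tiny network: one tanh hidden layer, scalar output.
    return {
        "W1": rng.normal(scale=1.0 / np.sqrt(d_in), size=(d_hidden, d_in)),
        "b1": np.zeros(d_hidden),
        "w2": rng.normal(scale=1.0 / np.sqrt(d_hidden), size=d_hidden),
    }

def forward(params, X):
    H = np.tanh(X @ params["W1"].T + params["b1"])  # (n, d_hidden)
    return H @ params["w2"]                         # (n,)

def tangent_features(params, X):
    """Per-example gradient of the output w.r.t. all parameters, flattened."""
    H = np.tanh(X @ params["W1"].T + params["b1"])
    dpre = (1.0 - H ** 2) * params["w2"]            # d f / d pre-activation
    g_W1 = dpre[:, :, None] * X[:, None, :]         # (n, d_hidden, d_in)
    return np.concatenate([g_W1.reshape(len(X), -1), dpre, H], axis=1)

def flatten(params):
    return np.concatenate([params["W1"].ravel(), params["b1"], params["w2"]])

def unflatten(theta, d_in, d_hidden):
    n_w1 = d_hidden * d_in
    return {
        "W1": theta[:n_w1].reshape(d_hidden, d_in),
        "b1": theta[n_w1:n_w1 + d_hidden],
        "w2": theta[n_w1 + d_hidden:],
    }

rng = np.random.default_rng(0)
d_in, d_hidden = 5, 16
X = rng.normal(size=(64, d_in))
theta0 = init_params(rng, d_in, d_hidden)   # stands in for an early checkpoint
Phi = tangent_features(theta0, X)           # tangent features at theta0
f0 = forward(theta0, X)

# First-order check: for a small parameter step delta,
# f(x; theta0 + delta) is close to f(x; theta0) + Phi(x) @ delta.
delta = 1e-4 * rng.normal(size=Phi.shape[1])
theta1 = unflatten(flatten(theta0) + delta, d_in, d_hidden)
lin_err = np.max(np.abs(forward(theta1, X) - (f0 + Phi @ delta)))

# Shallow model: ridge regression on the tangent features, fitted to the
# residual of the checkpoint's predictions on a synthetic target.
y = np.sin(X[:, 0])
lam = 1e-3
A = Phi.T @ Phi + lam * np.eye(Phi.shape[1])
w = np.linalg.solve(A, Phi.T @ (y - f0))
mse_base = np.mean((y - f0) ** 2)
mse_tangent = np.mean((y - (f0 + Phi @ w)) ** 2)
print(f"linearization error: {lin_err:.2e}")
print(f"train MSE, checkpoint: {mse_base:.4f}, tangent model: {mse_tangent:.4f}")
```

The linear model here is fitted in closed form, which is one reason a shallow model on tangent features can be cheap compared with continued training of the full network. How the paper selects a sparse subset of these features is not covered by this sketch.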

Code Implementations: 1 repo
