LG, AI, NE, ML | May 16, 2019

Formal derivation of Mesh Neural Networks with their Forward-Only gradient Propagation

arXiv:1905.06684v6
Originality: Highly original
AI Analysis

This work addresses the challenge of training flexible neural architectures for large-scale applications, offering a novel alternative to back-propagation.

The paper tackles the problem of training neural networks with arbitrary topologies. It proposes Mesh Neural Networks (MNNs), in which neurons can be connected in any pattern to route information, and in which state and error gradients are computed directly from state updates, without a backward pass. The authors present this as potentially scalable to very large sparse networks.

This paper proposes the Mesh Neural Network (MNN), a novel architecture that allows neurons to be connected in any topology in order to route information efficiently. In MNNs, information is propagated between neurons through a state transition function. State and error gradients are then computed directly from state updates, without backward computation. The MNN architecture and the error propagation scheme are formalized and derived in tensor algebra. The proposed computational model can fully supply a gradient descent process and, owing to its expressivity and training efficiency, is potentially better suited to very large scale sparse NNs than approaches based on back-propagation and computational graphs.
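
To make the idea of computing gradients from state updates concrete, here is a minimal NumPy sketch. It is not the paper's tensor derivation: it assumes a single state vector updated as `s <- tanh(W s)`, a 0/1 `mask` that encodes the topology, and a forward-carried sensitivity tensor `P[k, i, j] = d s[k] / d W[i, j]` in the style of real-time recurrent learning. The names `mnn_step`, `forward_only_grad`, `mask`, and `out_idx` are hypothetical and chosen only for illustration.

```python
import numpy as np


def mnn_step(W, s, P):
    """One state transition plus the matching forward sensitivity update.

    W: (n, n) effective weights, s: (n,) state,
    P: (n, n, n) with P[k, i, j] = d s[k] / d W[i, j].
    """
    s_next = np.tanh(W @ s)
    # d(W s)[k] / d W[i, j] = delta(k, i) * s[j] + sum_m W[k, m] * P[m, i, j]
    dA = np.einsum("km,mij->kij", W, P)
    idx = np.arange(len(s))
    dA[idx, idx, :] += s  # adds s[j] to every dA[k, k, j]
    # Chain rule through tanh: tanh'(a) = 1 - tanh(a)^2
    P_next = (1.0 - s_next**2)[:, None, None] * dA
    return s_next, P_next


def forward_only_grad(W, mask, s0, target, out_idx, steps):
    """Run `steps` transitions and return (loss, dloss/dW) with no backward pass."""
    W_eff = W * mask  # assumed: absent connections are zeroed by the mask
    n = len(s0)
    s = s0.copy()
    P = np.zeros((n, n, n))  # the initial state does not depend on W
    for _ in range(steps):
        s, P = mnn_step(W_eff, s, P)
    err = s[out_idx] - target
    loss = 0.5 * float(err @ err)
    # Contract the output error with the carried sensitivities.
    grad = np.einsum("k,kij->ij", err, P[out_idx]) * mask
    return loss, grad


# Small usage example with an arbitrary (random) topology.
rng = np.random.default_rng(0)
n = 4
W = 0.5 * rng.normal(size=(n, n))
mask = (rng.random((n, n)) < 0.6).astype(float)
s0 = rng.normal(size=n)
loss, grad = forward_only_grad(W, mask, s0, np.array([0.3]), [3], steps=3)
```

The gradient is available as soon as the last state update finishes, which is the property the abstract emphasizes. This dense sketch stores `n**3` sensitivities, so it is only practical for small `n`; exploiting sparsity in the way the paper suggests would require storing entries only for existing connections.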

Code Implementations: 1 repo
