LG | STAT-MECH | ML | May 2, 2017

Random active path model of deep neural networks with diluted binary synapses

arXiv:1705.00850v3
Originality: Incremental advance
AI Analysis

This work addresses the theoretical understanding of deep learning mechanisms and is assessed as an incremental contribution to machine learning theory.

The authors tackled the challenge of theoretically understanding deep learning by proposing a random active path model to study collective properties of deep neural networks with binary synapses under connection removal perturbations. They observed a critical perturbation value separating spin glass and paramagnetic regimes, with the paramagnetic phase conjectured to have poor generalization performance.

Deep learning has become a powerful and popular tool for a variety of machine learning tasks. However, it is challenging to understand the mechanism of deep learning from a theoretical perspective. In this work, we propose a random active path model to study collective properties of deep neural networks with binary synapses, under the removal perturbation of connections between layers. In the model, the path from input to output is randomly activated, and the corresponding input unit constrains the weights along the path into the form of a $p$-weight interaction glass model. A critical value of the perturbation is observed to separate a spin glass regime from a paramagnetic regime, and the transition is of first order. The paramagnetic phase is conjectured to have poor generalization performance.
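
To make the idea of a path-induced $p$-weight interaction concrete, the following is a minimal toy sketch, not the authors' formulation. It assumes binary weights in {+1, -1}, models the removal perturbation by setting each connection to 0 independently with probability `removal_prob`, and treats one active path as the product of the input unit's value and the weights along the path. The function names, the uniform path sampling, and the energy definition (minus the sum of path products) are all illustrative assumptions.

```python
import numpy as np

def sample_diluted_weights(rng, widths, removal_prob):
    """Draw +1/-1 weights for each layer; a removed connection is stored as 0.

    Assumption: each connection is removed independently with
    probability removal_prob.
    """
    layers = []
    for n_in, n_out in zip(widths[:-1], widths[1:]):
        signs = rng.choice([-1, 1], size=(n_in, n_out))
        keep = rng.random((n_in, n_out)) >= removal_prob
        layers.append(signs * keep)
    return layers

def sample_path(rng, widths):
    """Pick one unit per layer uniformly at random (hypothetical sampling rule)."""
    return [int(rng.integers(n)) for n in widths]

def path_interaction(weights, path, input_value):
    """Product of the input value and the weights along one path.

    With p weight layers this couples p weights at once; the result is 0
    if any connection on the path has been removed.
    """
    prod = input_value
    for layer, (i, j) in zip(weights, zip(path[:-1], path[1:])):
        prod *= layer[i, j]
    return int(prod)

def toy_energy(weights, paths, inputs):
    """Illustrative energy: minus the sum of path interactions (an assumption)."""
    return -sum(path_interaction(weights, p, inputs[p[0]]) for p in paths)
```

In this sketch, a network with widths `[4, 3, 2]` has two weight layers, so each path couples two weights. Removing connections makes a growing fraction of path terms vanish, which gives a rough intuition for why the strength of the removal perturbation can change the collective behavior of the weights.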
