LG · ML · May 24, 2019

Dichotomize and Generalize: PAC-Bayesian Binary Activated Deep Neural Networks

arXiv:1905.10259v5 · 58 citations
Originality: Incremental advance
AI Analysis

This paper addresses the challenge of training binary activated networks for efficient inference, though its application of PAC-Bayesian theory to this specific activation type appears incremental.

The paper tackles training deep neural networks with binary activation functions, which are non-differentiable, by developing an end-to-end framework and providing nonvacuous PAC-Bayesian generalization bounds, with performance validated on real-life datasets.

We present a comprehensive study of multilayer neural networks with binary activation, relying on the PAC-Bayesian theory. Our contributions are twofold: (i) we develop an end-to-end framework to train a binary activated deep neural network, (ii) we provide nonvacuous PAC-Bayesian generalization bounds for binary activated deep neural networks. Our results are obtained by minimizing the expected loss of an architecture-dependent aggregation of binary activated deep neural networks. Our analysis inherently overcomes the fact that the binary activation function is non-differentiable. The performance of our approach is assessed through a thorough numerical experiment protocol on real-life datasets.
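As a rough illustration of why an aggregation can sidestep the non-differentiable activation, consider a single neuron. This is a sketch under stated assumptions, not the paper's implementation: it assumes an isotropic Gaussian distribution N(w, I) over the neuron's weights, and the function names and the example vectors are hypothetical. Under that assumption, the expected output of sgn(v·x) has the smooth closed form erf(w·x / (√2 ‖x‖)), which is differentiable in w. The code compares this closed form with a Monte Carlo average over sampled binary neurons.

```python
import math
import random


def sign(t):
    """Binary activation: +1 for positive inputs, -1 otherwise."""
    return 1.0 if t > 0 else -1.0


def expected_sign_closed_form(w, x):
    """E[sgn(v.x)] for v ~ N(w, I), via the Gaussian identity erf(w.x / (sqrt(2) * ||x||))."""
    dot = sum(wi * xi for wi, xi in zip(w, x))
    norm = math.sqrt(sum(xi * xi for xi in x))
    return math.erf(dot / (math.sqrt(2.0) * norm))


def expected_sign_monte_carlo(w, x, n_samples=20000, seed=0):
    """Average sgn(v.x) over weight vectors v sampled from N(w, I)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        v = [rng.gauss(wi, 1.0) for wi in w]  # one sampled binary neuron
        total += sign(sum(vi * xi for vi, xi in zip(v, x)))
    return total / n_samples


# Hypothetical mean weights and input, chosen only for illustration.
w = [0.5, -1.0, 0.25]
x = [1.0, 0.5, 2.0]

closed = expected_sign_closed_form(w, x)
estimate = expected_sign_monte_carlo(w, x)
print(f"closed form: {closed:.4f}, Monte Carlo estimate: {estimate:.4f}")
```

The two values should agree up to sampling noise. The point of the sketch is that the averaged output varies smoothly with w even though each sampled neuron outputs only ±1, so gradient-based training can act on the aggregate.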

Code Implementations: 1 repo
