LG · ML · Jul 17, 2018

Learning Neuron Non-Linearities with Kernel-Based Deep Neural Networks

arXiv:1807.06302v2 · 1 citation
Originality: Incremental advance
AI Analysis

This work addresses the crucial role of activation functions in neural network complexity, offering a method to enhance performance in tasks like sequence modeling, though it appears incremental as it builds on existing regularization frameworks.

The paper tackles the problem of selecting optimal neuron activation functions in deep neural networks by proposing a kernel-based expansion approach, showing it can outperform state-of-the-art LSTM cells in capturing long-term dependencies in challenging experiments.

The effectiveness of deep neural architectures has been widely supported by both experimental evidence and foundational principles. There is also clear evidence that the activation function (e.g. the rectifier and the LSTM units) plays a crucial role in the complexity of learning. Based on this remark, this paper discusses an optimal selection of the neuron non-linearity in a functional framework inspired by classic regularization arguments. It is shown that the best activation function is represented by a kernel expansion over the training set, which can be effectively approximated over a suitable set of points modeling 1-D clusters. The idea extends naturally to recurrent networks, where the expressiveness of kernel-based activation functions turns out to be a crucial ingredient for capturing long-term dependencies. We give experimental evidence of this property through a set of challenging experiments, in which we compare the results with neural architectures based on state-of-the-art LSTM cells.
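
As a rough illustration of what a kernel-expansion activation could look like, the sketch below writes the non-linearity as f(a) = sum_k alpha_k * K(a, c_k) over a small set of 1-D centers c_k. It is not the paper's implementation. The Gaussian kernel, the evenly spaced centers, the `width` value, the `KernelActivation` class, and the least-squares `fit` helper are all assumptions made for this example. In a network the coefficients would presumably be trained jointly with the weights.

```python
import numpy as np

def gaussian_kernel(a, centers, width):
    """Evaluate K(a, c_k) for every activation a and every center c_k.

    Returns an array of shape a.shape + (len(centers),).
    The Gaussian form is an assumption made for this sketch.
    """
    a = np.asarray(a, dtype=float)
    diff = a[..., None] - centers  # broadcast over the centers axis
    return np.exp(-0.5 * (diff / width) ** 2)

class KernelActivation:
    """Hypothetical activation f(a) = sum_k coeffs[k] * K(a, centers[k])."""

    def __init__(self, centers, width):
        self.centers = np.asarray(centers, dtype=float)
        self.width = float(width)
        self.coeffs = np.zeros_like(self.centers)  # learnable expansion coefficients

    def __call__(self, a):
        # Applied element-wise: the output has the same shape as the input.
        return gaussian_kernel(a, self.centers, self.width) @ self.coeffs

    def fit(self, a, target, ridge=1e-6):
        """Set the coefficients by ridge-regularized least squares (illustration only)."""
        G = gaussian_kernel(a, self.centers, self.width)
        A = G.T @ G + ridge * np.eye(len(self.centers))
        self.coeffs = np.linalg.solve(A, G.T @ np.asarray(target, dtype=float))
        return self

# Assumed setup: 15 evenly spaced centers standing in for the 1-D cluster points.
centers = np.linspace(-3.0, 3.0, 15)
act = KernelActivation(centers, width=0.5)

# Check that the expansion can reproduce a familiar non-linearity (tanh).
grid = np.linspace(-3.0, 3.0, 200)
act.fit(grid, np.tanh(grid))
max_err = float(np.max(np.abs(act(grid) - np.tanh(grid))))
```

Fitting to `tanh` only checks that the expansion is expressive enough to recover a standard activation. Because the coefficients are free, the same parameterization can represent other shapes as well.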

