Lifted Neural Networks
This work is an incremental improvement in neural network training. It may benefit machine learning researchers and practitioners by offering faster or more robust weight initialization.
The paper tackles the training of multi-layer feedforward neural networks by encoding activation functions as penalties in the training problem, which allows algorithms such as block-coordinate descent, with parallelizable steps, to be applied. Experiments indicate that the resulting models provide excellent initial guesses for the weights of standard neural networks.
We describe a novel family of models of multi-layer feedforward neural networks in which the activation functions are encoded via penalties in the training problem. Our approach is based on representing a non-decreasing activation function as the argmin of an appropriate convex optimization problem. The new framework allows for algorithms such as block-coordinate descent methods to be applied, in which each step is composed of a simple (no hidden layer) supervised learning problem that is parallelizable across data points and/or layers. Experiments indicate that the proposed models provide excellent initial guesses for weights for standard neural networks. In addition, the model provides avenues for interesting extensions, such as robustness against noisy inputs and optimizing over parameters in activation functions.
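To make the "activation as argmin" idea concrete, the sketch below checks numerically that the ReLU, a non-decreasing activation, coincides with the minimizer of a small convex problem: max(0, u) = argmin over v >= 0 of (v - u)^2. The abstract does not state which convex problem the authors use, so this quadratic form is an illustrative assumption, and the helper names (`relu`, `argmin_nonneg_quadratic`) and the grid search are hypothetical choices made only for this example.

```python
def relu(u):
    """Standard ReLU activation."""
    return max(0.0, u)


def argmin_nonneg_quadratic(u, grid_size=2001, v_max=5.0):
    """Brute-force argmin of (v - u)^2 over v in [0, v_max] on a uniform grid.

    Assumption: the grid stands in for a real convex solver; it is only
    meant to show that the constrained minimizer matches the ReLU.
    """
    best_v, best_obj = 0.0, float("inf")
    for i in range(grid_size):
        v = v_max * i / (grid_size - 1)
        obj = (v - u) ** 2
        if obj < best_obj:
            best_v, best_obj = v, obj
    return best_v


inputs = [-2.0, -0.5, 0.0, 0.75, 3.0]

# Largest disagreement between the argmin and the ReLU over the test inputs.
max_gap = max(abs(argmin_nonneg_quadratic(u) - relu(u)) for u in inputs)
print(max_gap)
```

Under this assumed form, the hard constraint "output = ReLU(input)" can be swapped for the penalty (v - u)^2 with v >= 0, which is convex in v for fixed u. That is the kind of structure that would let a block-coordinate scheme update one block of variables at a time.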