LG CV Oct 4, 2019

Farkas layers: don't shift the data, fix the geometry

Aram-Alexandre Pooladian, Chris Finlay, Adam M Oberman

arXiv:1910.02840v11.81 citations

Originality: Incremental advance

AI Analysis

This addresses a fundamental problem in deep learning for researchers and practitioners by offering an alternative to batch normalization and weight initialization, though it appears incremental as it builds on existing linear programming results.

The paper tackles the challenge of training deep neural networks without batch normalization or careful weight initialization by introducing Farkas layers, a geometrically motivated method that ensures at least one neuron is active per layer; it demonstrates significant improvements in training capacity across various network sizes on benchmark datasets.

Successfully training deep neural networks often requires either batch normalization or appropriate weight initialization, both of which come with their own challenges. We propose an alternative, geometrically motivated method for training. Using elementary results from linear programming, we introduce Farkas layers: a method that ensures at least one neuron is active at a given layer. Focusing on residual networks with ReLU activation, we empirically demonstrate a significant improvement in training capacity in the absence of batch normalization or methods of initialization across a broad range of network sizes on benchmark datasets.
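The abstract does not spell out how a layer can guarantee an active neuron, so the following is only a minimal sketch of one construction that yields such a guarantee, not necessarily the paper's exact formulation. It assumes an extra "aggregate" neuron whose pre-activation is the negative sum of the others; the pre-activations then sum to zero, so they cannot all be strictly negative. The names `farkas_preactivations` and `farkas_layer` are hypothetical and chosen for illustration.

```python
import random


def relu(v):
    """Elementwise ReLU on a list of floats."""
    return [max(0.0, x) for x in v]


def farkas_preactivations(W, b, x):
    """Pre-activations of a linear layer plus one extra aggregate neuron.

    Assumption (not taken from the abstract): the extra neuron's
    pre-activation is the negative sum of the others, so all
    pre-activations sum to zero and at least one is non-negative.
    """
    z = [sum(w_ij * x_j for w_ij, x_j in zip(row, x)) + b_i
         for row, b_i in zip(W, b)]
    z.append(-sum(z))  # aggregate neuron
    return z


def farkas_layer(W, b, x):
    """ReLU applied to the augmented pre-activations."""
    return relu(farkas_preactivations(W, b, x))


# A layer that would be entirely dead under plain ReLU:
# the single ordinary neuron has pre-activation -3 for this input.
dead_W, dead_b, dead_x = [[-1.0, -1.0]], [-1.0], [1.0, 1.0]
dead_out = farkas_layer(dead_W, dead_b, dead_x)

# A randomly initialized layer with 3 ordinary neurons and 4 inputs.
random.seed(0)
W = [[random.gauss(0, 1) for _ in range(4)] for _ in range(3)]
b = [random.gauss(0, 1) for _ in range(3)]
x = [random.gauss(0, 1) for _ in range(4)]
out = farkas_layer(W, b, x)
```

In the dead-layer example the aggregate neuron carries the signal that plain ReLU would have zeroed out. The guarantee is non-strict: if every pre-activation is exactly zero, no neuron is strictly positive.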
