LG, CV · Oct 4, 2019

Farkas layers: don't shift the data, fix the geometry

arXiv:1910.02840v1 · 1 citation
Originality: Incremental advance
AI Analysis

The paper offers researchers and practitioners an alternative to batch normalization and careful weight initialization when training deep networks. The advance appears incremental, since it builds on existing linear programming results.

The paper tackles the challenge of training deep neural networks without batch normalization or careful weight initialization. It introduces Farkas layers, a geometrically motivated method that ensures at least one neuron is active in each layer. On benchmark datasets, the authors report a significant improvement in training capacity for residual ReLU networks across a broad range of network sizes.

Successfully training deep neural networks often requires either batch normalization or appropriate weight initialization, both of which come with their own challenges. We propose an alternative, geometrically motivated method for training. Using elementary results from linear programming, we introduce Farkas layers: a method that ensures at least one neuron is active at a given layer. Focusing on residual networks with ReLU activation, we empirically demonstrate a significant improvement in training capacity in the absence of batch normalization or methods of initialization across a broad range of network sizes on benchmark datasets.
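
The guarantee that at least one neuron is active can be illustrated with a minimal sketch. It assumes one simple construction, which is not necessarily the authors' exact implementation: append an extra neuron whose weight row is the negative sum of the other rows and whose bias makes all biases sum to a small positive constant. The pre-activations then sum to that constant for every input, so at least one of them must be positive. The names `farkas_layer` and `eps` are hypothetical and chosen only for this illustration.

```python
import random


def farkas_layer(weights, biases, x, eps=1e-3):
    """ReLU layer with one appended neuron so that some output is always positive.

    weights: list of rows (lists of floats), biases: list of floats, x: list of floats.
    Illustrative construction only; eps is an assumed positive constant.
    """
    d = len(x)
    # Extra weight row: negative sum of the existing rows, so all rows sum to zero.
    extra_w = [-sum(row[j] for row in weights) for j in range(d)]
    # Extra bias: chosen so that all biases together sum to eps > 0.
    extra_b = eps - sum(biases)
    all_w = weights + [extra_w]
    all_b = biases + [extra_b]
    # Pre-activations sum to eps for any x, hence at least one is positive.
    pre = [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(all_w, all_b)]
    return [max(0.0, p) for p in pre]


# Usage: strongly negative biases would leave a plain ReLU layer dead at x = 0.
random.seed(0)
W = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(4)]
b = [-5.0] * 4
out = farkas_layer(W, b, [0.0, 0.0, 0.0])
print(out)
```

In this sketch the layer has one more output than the plain layer, and the appended neuron has no free parameters of its own: its weights and bias are determined by the other rows.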

Code Implementations: 1 repo