LG · ML · Jul 11, 2019

Order and Chaos: NTK views on DNN Normalization, Checkerboard and Boundary Artifacts

arXiv:1907.05715v2 · 32 citations
Originality: Incremental advance
AI Analysis

This work addresses training and generalization issues in deep neural networks, particularly generative models. It offers theoretical insight into artifacts such as checkerboard patterns, together with practical fixes, though the advance is incremental because it builds on existing NTK theory.

The paper analyzes deep neural networks using the Neural Tangent Kernel and identifies 'order' and 'chaos' regimes that affect training and generalization. It explains how these regimes relate to checkerboard patterns and border artifacts in deconvolutional networks, and proposes solutions that improve DCGAN sample quality without batch normalization.

We analyze architectural features of Deep Neural Networks (DNNs) using the so-called Neural Tangent Kernel (NTK), which describes the training and generalization of DNNs in the infinite-width setting. In this setting, we show that for fully-connected DNNs, as the depth grows, two regimes appear: "order", where the (scaled) NTK converges to a constant, and "chaos", where it converges to a Kronecker delta. Extreme order slows down training while extreme chaos hinders generalization. Using the scaled ReLU as a nonlinearity, we end up in the ordered regime. In contrast, Layer Normalization brings the network into the chaotic regime. We observe a similar effect for Batch Normalization (BN) applied after the last nonlinearity. We uncover the same order and chaos modes in Deep Deconvolutional Networks (DC-NNs). Our analysis explains the appearance of so-called checkerboard patterns and border artifacts. Moving the network into the chaotic regime prevents checkerboard patterns; we propose a graph-based parametrization which eliminates border artifacts; finally, we introduce a new layer-dependent learning rate to improve the convergence of DC-NNs. We illustrate our findings on DCGANs: the ordered regime leads to a collapse of the generator to a checkerboard mode, which can be avoided by tuning the nonlinearity to reach the chaotic regime. As a result, we are able to obtain good quality samples for DCGANs without BN.
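The ordered regime described in the abstract can be illustrated numerically. The sketch below is not the paper's code: it assumes the standard infinite-width NTK recursion for a bias-free fully-connected network with the scaled ReLU σ(x) = √2·max(0, x), and unit-norm inputs, so that each kernel entry depends only on the input correlation `rho0`. The function names `relu_kernels` and `normalized_ntk` are hypothetical. It tracks the NTK divided by its diagonal value (which equals the depth under these assumptions) and shows that the off-diagonal entries for very different input pairs drift toward a common value as depth grows.

```python
import math


def relu_kernels(rho):
    """One layer of the scaled-ReLU kernel maps for unit-variance inputs.

    Returns (sigma, sigma_dot), where sigma = E[s(u) s(v)] and
    sigma_dot = E[s'(u) s'(v)] for s(x) = sqrt(2) * max(0, x) and
    jointly Gaussian u, v with correlation rho.
    """
    rho = max(-1.0, min(1.0, rho))  # guard against floating-point drift
    angle = math.acos(rho)
    sigma = (math.sqrt(1.0 - rho * rho) + rho * (math.pi - angle)) / math.pi
    sigma_dot = (math.pi - angle) / math.pi
    return sigma, sigma_dot


def normalized_ntk(rho0, depth):
    """NTK entry for input correlation rho0, divided by the diagonal value.

    Assumes no bias term, so Theta^(1) = Sigma^(1) = rho0 and the
    diagonal of the depth-L NTK equals L.
    """
    rho = rho0    # Sigma^(1)(x, y)
    theta = rho0  # Theta^(1)(x, y)
    for _ in range(depth - 1):
        rho, sigma_dot = relu_kernels(rho)
        # Theta^(l+1) = Theta^(l) * Sigma_dot^(l+1) + Sigma^(l+1)
        theta = theta * sigma_dot + rho
    return theta / depth


if __name__ == "__main__":
    for depth in (2, 20, 200, 2000):
        row = [normalized_ntk(r, depth) for r in (-0.5, 0.0, 0.5)]
        print(depth, ["%.3f" % v for v in row])
```

Under these assumptions the spread between the three columns shrinks as the depth increases, which is the "NTK converges to a constant" behaviour of the ordered regime. The convergence is slow for this bias-free variant, so large depths are needed to see it clearly.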

