CV · May 18, 2018

Batch Normalization in the final layer of generative networks

arXiv:1805.07389v1 · 2 citations
Originality: Synthesis-oriented
AI Analysis

The paper addresses a practical issue for researchers and practitioners in generative modeling, offering an incremental improvement in training efficiency.

The paper challenges the heuristic of avoiding batch normalization in the final layer of generative networks, showing that using it can lead to faster training by aligning the generator with the target distribution's color statistics.

Generative networks have shown great promise in generating photo-realistic images. Despite this, the theory surrounding them is still an active research area. Much of the useful work with generative networks relies on heuristics that tend to produce good results. One of these heuristics is the advice not to use Batch Normalization in the final layer of the generator network. Many state-of-the-art generative network architectures follow this heuristic, but the reasons given for doing so are inconsistent. This paper shows that this is not necessarily a good heuristic and that Batch Normalization can be beneficial in the final layer of the generator network, either by placing it before the final non-linear activation (usually a $\tanh$) or by replacing the final $\tanh$ activation altogether with Batch Normalization and clipping. We show that this can lead to faster training of generator networks by matching the generator to the mean and standard deviation of the target distribution's image colour values.
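
The second variant described in the abstract (Batch Normalization plus clipping in place of the final $\tanh$) can be illustrated with a minimal sketch. This is not the authors' implementation: it uses only the Python standard library, treats one colour channel as a flat list of values pooled over the batch, and fixes the scale `gamma` and shift `beta` to hypothetical target statistics (`target_std`, `target_mean`), whereas in a trained generator these parameters would be learned. The function name `batch_norm_clip` and all numeric values are assumptions chosen for illustration.

```python
import math
import random


def batch_norm_clip(values, gamma, beta, eps=1e-5, low=-1.0, high=1.0):
    """Batch-normalize one colour channel, then scale, shift, and clip.

    values: pre-activation outputs for a single channel, pooled over
            every image and pixel in the batch.
    gamma:  scale; after normalization it sets the output std.
    beta:   shift; after normalization it sets the output mean.
    """
    n = len(values)
    mean = sum(values) / n
    # Biased (divide-by-n) batch variance, as used by training-mode BN.
    var = sum((v - mean) ** 2 for v in values) / n
    inv_std = 1.0 / math.sqrt(var + eps)
    # Clipping stands in for tanh to keep outputs in the image range.
    return [min(high, max(low, gamma * (v - mean) * inv_std + beta))
            for v in values]


def mean_and_std(values):
    """Return the mean and (biased) standard deviation of a list."""
    n = len(values)
    mean = sum(values) / n
    return mean, math.sqrt(sum((v - mean) ** 2 for v in values) / n)


# Hypothetical pre-activations whose statistics are far from image range.
rng = random.Random(0)
pre_activations = [rng.gauss(3.0, 5.0) for _ in range(10_000)]

# Hypothetical colour statistics of the target dataset, in [-1, 1] units.
target_mean, target_std = -0.1, 0.25

out = batch_norm_clip(pre_activations, gamma=target_std, beta=target_mean)
out_mean, out_std = mean_and_std(out)
print(f"output mean={out_mean:.3f}, std={out_std:.3f}")
```

Whatever the statistics of the incoming pre-activations, the output channel ends up with approximately the mean and standard deviation given by `beta` and `gamma`, which is the matching effect the abstract credits for faster training. Clipping only alters the few values that fall outside $[-1, 1]$.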

