LGAIMLMay 26, 2019

State-Reification Networks: Improving Generalization by Modeling the Distribution of Hidden Representations

arXiv:1905.11382v15 citations
Originality Incremental advance
AI Analysis

This addresses the problem of poor generalization in neural networks for machine learning practitioners, though it appears incremental as it builds on existing methods by adding a regularization-like technique.

The paper tackles the brittleness of neural networks, such as adversarial examples and issues with recurrent nets, by introducing state reification, which models the distribution of hidden states during training and projects test states toward it, resulting in improved generalization, especially with sparse labeled data and in adversarial training scenarios.

Machine learning promises methods that generalize well from finite labeled data. However, the brittleness of existing neural net approaches is revealed by notable failures, such as the existence of adversarial examples that are misclassified despite being nearly identical to a training example, or the inability of recurrent sequence-processing nets to stay on track without teacher forcing. We introduce a method, which we refer to as \emph{state reification}, that involves modeling the distribution of hidden states over the training data and then projecting hidden states observed during testing toward this distribution. Our intuition is that if the network can remain in a familiar manifold of hidden space, subsequent layers of the net should be well trained to respond appropriately. We show that this state-reification method helps neural nets to generalize better, especially when labeled data are sparse, and also helps overcome the challenge of achieving robust generalization with adversarial training.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes