CL | NA | May 9, 2020

Generalizing Outside the Training Set: When Can Neural Networks Learn Identity Effects?

arXiv:2005.04330v1 | 6 citations
Originality: Incremental advance
AI Analysis

This paper addresses a fundamental limitation of machine learning on cognitive tasks such as language, offering incremental insight into when generalization fails.

The paper asks whether neural networks can learn identity effects (constraints where an object's well-formedness depends on whether its components are identical) from data without explicit guidance. It proves that a broad class of algorithms, including standard deep neural networks, cannot make the correct inference under certain input encodings, and it illustrates the result with computational experiments.

Often in language and other areas of cognition, whether two components of an object are identical or not determines whether it is well formed. We call such constraints identity effects. When developing a system to learn well-formedness from examples, it is easy enough to build in an identity effect. But can identity effects be learned from the data without explicit guidance? We provide a simple framework in which we can rigorously prove that algorithms satisfying simple criteria cannot make the correct inference. We then show that a broad class of algorithms, including deep neural networks with standard architectures trained with backpropagation, satisfy our criteria, depending on the encoding of inputs. Finally, we demonstrate our theory with computational experiments in which we explore the effect of different input encodings on the ability of algorithms to generalize to novel inputs.
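To make the role of the input encoding concrete, here is a minimal sketch, not the authors' experiment. It assumes two-letter words over a 26-letter alphabet, a one-hot encoding of each letter, and RBF kernel ridge regression as a stand-in for the class of algorithms the abstract describes. The held-out letters Y and Z, the `gamma` value, the ridge term, and all function names are illustrative assumptions. The learner fits the identity effect on letters it has seen, but it assigns the same rating to the novel words YY and YZ, because under one-hot encoding both are equally distant from every training word.

```python
import itertools
import string

import numpy as np

LETTERS = string.ascii_uppercase      # 26-letter alphabet
TRAIN_LETTERS = LETTERS[:24]          # A..X appear in training; Y and Z are held out (assumption)


def one_hot(letter):
    """One-hot vector for a single letter."""
    v = np.zeros(len(LETTERS))
    v[LETTERS.index(letter)] = 1.0
    return v


def encode(word):
    """Encode a two-letter word by concatenating the two letter encodings."""
    return np.concatenate([one_hot(word[0]), one_hot(word[1])])


def rbf_kernel(A, B, gamma=0.5):
    """Gaussian kernel between the rows of A and the rows of B."""
    sq_dist = (A ** 2).sum(1)[:, None] + (B ** 2).sum(1)[None, :] - 2 * A @ B.T
    return np.exp(-gamma * sq_dist)


# Training set: every two-letter word over A..X, labeled 1 if the letters match.
train_words = ["".join(p) for p in itertools.product(TRAIN_LETTERS, repeat=2)]
X = np.stack([encode(w) for w in train_words])
y = np.array([1.0 if w[0] == w[1] else 0.0 for w in train_words])

# Kernel ridge regression with a small ridge term for numerical stability.
K = rbf_kernel(X, X)
alpha = np.linalg.solve(K + 1e-6 * np.eye(len(X)), y)


def rate(word):
    """Predicted well-formedness rating for a two-letter word."""
    return (rbf_kernel(encode(word)[None, :], X) @ alpha)[0]


for word in ["AA", "AB", "YY", "YZ"]:
    print(word, round(rate(word), 4))
```

Swapping `one_hot` for a different letter encoding (for example, fixed random vectors) changes the distances between novel and training words, which is the kind of encoding dependence the experiments in the paper explore. This sketch makes no claim about what such an encoding would predict.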

Code Implementations: 1 repo
