CL | Jul 14, 2020

Can neural networks acquire a structural bias from raw linguistic data?

arXiv:2007.06761v2 | 57 citations
AI Analysis

This work addresses whether linguistic universals can be learned without innate biases. It is incremental in that it builds on prior work with neural networks.

The study investigates whether BERT can acquire a structural bias from raw linguistic data. It finds that BERT makes structural generalizations in 3 out of 4 domains but a linear one in NPI licensing, which the authors argue is the strongest evidence so far from artificial learners that such a bias can be acquired.

We evaluate whether BERT, a widely used neural network for sentence processing, acquires an inductive bias towards forming structural generalizations through pretraining on raw data. We conduct four experiments testing its preference for structural vs. linear generalizations in different structure-dependent phenomena. We find that BERT makes a structural generalization in 3 out of 4 empirical domains (subject-auxiliary inversion, reflexive binding, and verb tense detection in embedded clauses) but makes a linear generalization when tested on NPI licensing. We argue that these results are the strongest evidence so far from artificial learners supporting the proposition that a structural bias can be acquired from raw data. If this conclusion is correct, it is tentative evidence that some linguistic universals can be acquired by learners without innate biases. However, the precise implications for human language acquisition are unclear, as humans learn language from significantly less data than BERT.
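
The contrast between a linear and a structural generalization can be made concrete with a toy version of subject-auxiliary inversion. The sketch below is only an illustration of the two hypotheses, not the paper's evaluation procedure: the function names, the small auxiliary list, and the hand-supplied main-clause auxiliary index are all assumptions introduced here.

```python
# Toy illustration (hypothetical, not the paper's method): two competing rules
# for turning a declarative sentence into a yes/no question.

AUXILIARIES = {"is", "can", "will", "has"}  # assumed small auxiliary inventory


def front(tokens, i):
    """Move the token at position i to the front of the sentence."""
    return [tokens[i]] + tokens[:i] + tokens[i + 1:]


def linear_rule(tokens):
    """Linear hypothesis: front the first auxiliary in left-to-right order."""
    i = next(k for k, tok in enumerate(tokens) if tok in AUXILIARIES)
    return front(tokens, i)


def structural_rule(tokens, main_aux_index):
    """Structural hypothesis: front the main-clause auxiliary.

    The index is supplied by hand here; a real system would need a parse.
    """
    return front(tokens, main_aux_index)


# Simple sentence: both rules agree, so it cannot tell them apart.
simple = "the dog can jump".split()
print(" ".join(linear_rule(simple)))         # can the dog jump
print(" ".join(structural_rule(simple, 2)))  # can the dog jump

# Sentence with a relative clause: the rules diverge.
complex_ = "the dog that is barking can jump".split()
print(" ".join(linear_rule(complex_)))         # is the dog that barking can jump
print(" ".join(structural_rule(complex_, 5)))  # can the dog that is barking jump
```

Only sentences of the second kind distinguish the two hypotheses, which is why a learner's behaviour on them is informative about its bias.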
