LG · CL · Oct 31, 2022

Emergent Linguistic Structures in Neural Networks are Fragile

Oxford
arXiv:2210.17406v82 citations · h-index: 14
Originality: Incremental advance
AI Analysis

This work addresses how to evaluate LLM capabilities beyond accuracy, a question relevant to both researchers and practitioners. It is rated incremental because it builds on existing probing methods.

The paper assesses the robustness of linguistic representations in large language models (LLMs) by proposing a framework that measures their consistency and fragility. It finds that emergent syntactic representations are brittle: context-free models such as GloVe are sometimes competitive with context-dependent models, yet equally fragile under syntax-preserving perturbations.

Large Language Models (LLMs) have been reported to have strong performance on natural language processing tasks. However, performance metrics such as accuracy do not measure the quality of the model in terms of its ability to robustly represent complex linguistic structures. In this paper, focusing on the ability of language models to represent syntax, we propose a framework to assess the consistency and robustness of linguistic representations. To this end, we introduce measures of robustness of neural network models that leverage recent advances in extracting linguistic constructs from LLMs via probing tasks, i.e., simple tasks used to extract meaningful information about a single facet of a language model, such as syntax reconstruction and root identification. Empirically, we study the performance of four LLMs across six different corpora on the proposed robustness measures by analysing their performance and robustness with respect to syntax-preserving perturbations. We provide evidence that context-free representations (e.g., GloVe) are in some cases competitive with context-dependent representations from modern LLMs (e.g., BERT), yet equally brittle to syntax-preserving perturbations. Our key observation is that emergent syntactic representations in neural networks are brittle. We make the code, trained models and logs available to the community as a contribution to the debate about the capabilities of LLMs.
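
To make the idea of measuring a probe's robustness to syntax-preserving perturbations concrete, here is a minimal sketch in Python. It is not the paper's implementation or its exact measures. The toy root-identification probe, the `SWAP_LEXICON` word substitutions, and the function names `perturb_same_category` and `robustness_report` are all hypothetical, chosen only for illustration. The sketch compares probe accuracy on original and perturbed sentences and reports how often the prediction stays the same.

```python
from typing import Callable, Dict, List

Sentence = List[str]

# Hypothetical lexicon: each word maps to another word of the same category,
# so a swap leaves the sentence's syntactic structure (and its root) unchanged.
SWAP_LEXICON: Dict[str, str] = {"cat": "dog", "chased": "followed", "mouse": "bird"}

# Hypothetical "knowledge" of the toy probe: verbs it recognises as roots.
KNOWN_VERBS = {"chased", "sleeps"}


def perturb_same_category(sentence: Sentence) -> Sentence:
    """Replace each word by its same-category substitute, if one is listed."""
    return [SWAP_LEXICON.get(word, word) for word in sentence]


def toy_root_probe(sentence: Sentence) -> int:
    """Toy stand-in for a trained probe: index of the first known verb, else 0."""
    for index, word in enumerate(sentence):
        if word in KNOWN_VERBS:
            return index
    return 0


def robustness_report(
    probe: Callable[[Sentence], int],
    sentences: List[Sentence],
    gold_roots: List[int],
    perturb: Callable[[Sentence], Sentence],
) -> Dict[str, float]:
    """Accuracy before and after perturbation, plus prediction consistency."""
    correct_original = correct_perturbed = consistent = 0
    for sentence, gold in zip(sentences, gold_roots):
        pred_original = probe(sentence)
        pred_perturbed = probe(perturb(sentence))
        correct_original += int(pred_original == gold)
        correct_perturbed += int(pred_perturbed == gold)
        consistent += int(pred_original == pred_perturbed)
    total = len(sentences)
    return {
        "accuracy_original": correct_original / total,
        "accuracy_perturbed": correct_perturbed / total,
        "consistency": consistent / total,
    }


sentences = [["the", "cat", "chased", "the", "mouse"], ["a", "dog", "sleeps"]]
gold_roots = [2, 2]  # index of the root verb in each sentence
report = robustness_report(toy_root_probe, sentences, gold_roots, perturb_same_category)
print(report)
```

In this toy setup the probe is perfect on the original sentences but fails once the verb it has memorised is swapped for an unseen one, which is the kind of gap between accuracy and robustness the abstract describes. A real experiment would replace `toy_root_probe` with a probe trained on model representations.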

Code Implementations: 1 repo
