CL · Aug 8, 2024

Analysis of Argument Structure Constructions in the Large Language Model BERT

Pegah Ramezani, Achim Schilling, Patrick Krauss

arXiv:2408.04270v1 · 3.47 citations · h-index: 5

Originality: Synthesis-oriented

AI Analysis

This research provides insights into linguistic processing in neural language models, with potential implications for understanding human brain mechanisms, but it is incremental as it extends prior LSTM analyses to BERT.

This study analyzed how BERT processes Argument Structure Constructions (ASCs) using a dataset of 2000 sentences, finding that probe accuracies exceeded 90% from layer 2 onward and that OBJ tokens were crucial for differentiating ASCs based on attention weight analysis.
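The attention-weight analysis can be illustrated with a Fisher Discriminant Ratio (FDR) computed for one token type at a time. The sketch below assumes the pairwise form (squared difference of class means divided by the sum of class variances, summed over all construction pairs); the summary does not say which FDR variant the authors used. The function name, the `eps` guard, and the attention values are hypothetical and only serve the illustration.

```python
from itertools import combinations
from statistics import mean, pvariance


def fisher_discriminant_ratio(values_by_class, eps=1e-12):
    """Pairwise FDR for one scalar feature (e.g. attention weight on a token type).

    values_by_class maps a construction label to a list of feature values.
    Assumed form: sum over class pairs of (mu_i - mu_j)^2 / (var_i + var_j).
    eps avoids division by zero when both classes have zero variance.
    """
    stats = [(mean(v), pvariance(v)) for v in values_by_class.values()]
    total = 0.0
    for (mu_i, var_i), (mu_j, var_j) in combinations(stats, 2):
        total += (mu_i - mu_j) ** 2 / (var_i + var_j + eps)
    return total


# Hypothetical attention weights, NOT data from the paper.
obj_attention = {
    "transitive": [0.10, 0.12, 0.11],
    "ditransitive": [0.30, 0.32, 0.31],
}
subj_attention = {
    "transitive": [0.20, 0.22, 0.21],
    "ditransitive": [0.21, 0.20, 0.22],
}

fdr_obj = fisher_discriminant_ratio(obj_attention)
fdr_subj = fisher_discriminant_ratio(subj_attention)
print(fdr_obj > fdr_subj)
```

A token type whose attention weights differ strongly between constructions relative to their spread receives a high score, which matches the reported ranking in which OBJ tokens score highest.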

This study investigates how BERT processes and represents Argument Structure Constructions (ASCs), extending previous LSTM analyses. Using a dataset of 2000 sentences across four ASC types (transitive, ditransitive, caused-motion, resultative), we analyzed BERT's token embeddings across 12 layers. Visualizations with MDS and t-SNE and clustering quantified by Generalized Discrimination Value (GDV) were used. Feedforward classifiers (probes) predicted construction categories from embeddings. CLS token embeddings clustered best in layers 2-4, decreased in intermediate layers, and slightly increased in final layers. DET and SUBJ embeddings showed consistent clustering in intermediate layers, VERB embeddings increased in clustering from layer 1 to 12, and OBJ embeddings peaked in layer 10. Probe accuracies indicated low construction information in layer 1, with over 90 percent accuracy from layer 2 onward, revealing latent construction information beyond GDV clustering. Fisher Discriminant Ratio (FDR) analysis of attention weights showed OBJ tokens were crucial for differentiating ASCs, followed by VERB and DET tokens. SUBJ, CLS, and SEP tokens had insignificant FDR scores. This study highlights BERT's layered processing of linguistic constructions and its differences from LSTMs. Future research will compare these findings with neuroimaging data to understand the neural correlates of ASC processing. This research underscores neural language models' potential to mirror linguistic processing in the human brain, offering insights into the computational and neural mechanisms underlying language understanding.
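The clustering measure named in the abstract can be sketched as follows. This is a minimal pure-Python version of the Generalized Discrimination Value as it is commonly defined: each dimension is z-scored and scaled by 0.5, then the mean inter-class distance is subtracted from the mean intra-class distance and the result is divided by the square root of the dimensionality. Whether the authors applied exactly this preprocessing is an assumption, and the toy points stand in for BERT embeddings. More negative values indicate better-separated clusters.

```python
import math
from itertools import combinations


def zscore_half(points):
    """Z-score every dimension over all points, then scale by 0.5."""
    n, dim = len(points), len(points[0])
    means = [sum(p[d] for p in points) / n for d in range(dim)]
    stds = [
        math.sqrt(sum((p[d] - means[d]) ** 2 for p in points) / n)
        for d in range(dim)
    ]
    return [
        [0.5 * (p[d] - means[d]) / stds[d] if stds[d] > 0 else 0.0
         for d in range(dim)]
        for p in points
    ]


def mean_distance(a, b=None):
    """Mean Euclidean distance within a (b is None) or between a and b."""
    pairs = list(combinations(a, 2)) if b is None else [(x, y) for x in a for y in b]
    return sum(math.dist(x, y) for x, y in pairs) / len(pairs)


def gdv(points, labels):
    """GDV sketch; needs at least two classes with two points each."""
    scaled = zscore_half(points)
    groups = [
        [p for p, lab in zip(scaled, labels) if lab == c]
        for c in sorted(set(labels))
    ]
    intra = sum(mean_distance(g) for g in groups) / len(groups)
    group_pairs = list(combinations(groups, 2))
    inter = sum(mean_distance(g, h) for g, h in group_pairs) / len(group_pairs)
    return (intra - inter) / math.sqrt(len(points[0]))


# Toy 2-D points standing in for embeddings (hypothetical data).
points = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
separated = gdv(points, [0, 0, 0, 1, 1, 1])
mixed = gdv(points, [0, 1, 0, 1, 0, 1])
print(separated < mixed)
```

In an analysis like the one described, `points` would be the embeddings of one token type at one layer and `labels` the four construction categories, giving one GDV per layer and token type.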

