CL Oct 24, 2022

An Empirical Revisiting of Linguistic Knowledge Fusion in Language Understanding Tasks

Changlong Yu, Tianyi Xiao, Lingpeng Kong, Yangqiu Song, Wilfred Ng

arXiv:2210.13002v123.9291 citations h-index: 52 Has Code

Originality Incremental advance

AI Analysis

This work challenges the assumption that explicit linguistic knowledge is necessary for improvements in language understanding tasks, calling for better baselines in future research.

The study tested the effectiveness of explicit linguistic priors in language models by replacing parsed graphs with trivial ones on GLUE benchmark tasks. Trivial graphs achieved competitive or better performance, suggesting that the gains may come from feature interactions rather than linguistic knowledge.

Though linguistic knowledge emerges during large-scale language model pretraining, recent work attempts to explicitly incorporate human-defined linguistic priors into task-specific fine-tuning. Infusing language models with syntactic or semantic knowledge from parsers has shown improvements on many language understanding tasks. To further investigate the effectiveness of structural linguistic priors, we conduct an empirical study that replaces parsed graphs or trees with trivial ones (which carry little linguistic knowledge, e.g., a balanced tree) for tasks in the GLUE benchmark. Encoding with trivial graphs achieves competitive or even better performance in fully-supervised and few-shot settings. This reveals that the gains might not be significantly attributable to explicit linguistic priors but rather to the additional feature interactions brought by fusion layers. Hence we call for attention to using trivial graphs as necessary baselines when designing advanced knowledge fusion methods in the future.
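To make the idea of a trivial graph concrete, the following is a minimal sketch of one way to build a balanced binary tree over token positions and turn it into an adjacency matrix that a graph-based fusion layer could consume in place of a parser's output. The abstract does not specify the construction, so the function names (`balanced_tree_heads`, `heads_to_adjacency`), the midpoint-as-root rule, and the added self-loops are all assumptions made for illustration, not the authors' implementation.

```python
from typing import List


def balanced_tree_heads(num_tokens: int) -> List[int]:
    """Return heads[i], the parent position of token i in a balanced
    binary tree built purely from positions. The root gets -1.
    (Hypothetical helper: the tree ignores the tokens themselves.)"""
    heads = [-1] * num_tokens

    def build(lo: int, hi: int, parent: int) -> None:
        if lo > hi:
            return
        mid = (lo + hi) // 2          # the middle token roots this span
        heads[mid] = parent
        build(lo, mid - 1, mid)       # left half hangs off the midpoint
        build(mid + 1, hi, mid)       # right half hangs off the midpoint

    build(0, num_tokens - 1, -1)
    return heads


def heads_to_adjacency(heads: List[int]) -> List[List[int]]:
    """Convert a head list into a symmetric 0/1 adjacency matrix.
    Self-loops are added here as an assumption; many graph encoders
    expect them, but the source does not say whether they are used."""
    n = len(heads)
    adj = [[0] * n for _ in range(n)]
    for child, parent in enumerate(heads):
        adj[child][child] = 1
        if parent >= 0:
            adj[child][parent] = 1
            adj[parent][child] = 1
    return adj


tokens = "the cat sat on the red mat".split()
heads = balanced_tree_heads(len(tokens))
adjacency = heads_to_adjacency(heads)
print(heads)  # [1, 3, 1, -1, 5, 3, 5]
```

Because the tree depends only on sentence length, it carries no syntactic or semantic information. Under the assumptions above, swapping it in for a parsed tree while keeping the fusion layers unchanged gives the kind of baseline the abstract argues for.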
