CL · May 15

GiLT: Augmenting Transformer Language Models with Dependency Graphs

arXiv:2605.15562 · 96.1 · Has Code
Predicted impact: top 8% in CL · last 90 days
Originality: Incremental advance
AI Analysis

For NLP researchers, GiLT offers a novel method to inject syntactic structure into language models without extra tokens, improving syntactic generalization.

GiLT augments Transformer language models with dependency graphs by modulating attention weights, achieving better syntactic generalization while maintaining competitive perplexity compared to baselines.

Augmenting Transformers with linguistic structures effectively enhances the syntactic generalization performance of language models. Previous work in this direction focuses on syntactic tree structures of languages, in particular constituency tree structures. We propose Graph-Infused Layers Transformer Language Model (GiLT) which leverages dependency graphs for augmenting Transformer language models. Unlike most previous work, GiLT does not insert extra structural tokens in language modeling; instead, it injects structural information into language modeling by modulating attention weights in the Transformer with features extracted from the dependency graph that is incrementally constructed along with token prediction. In our experiments, GiLT with semantic dependency graphs achieves better syntactic generalization while maintaining competitive perplexity in comparison with Transformer language model baselines. In addition, GiLT can be finetuned from a pretrained language model to achieve improved downstream task performance. Our code is released at https://github.com/cookie-pie-oops/GiLT-LM.
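The abstract describes the mechanism only at a high level, so the following is a minimal sketch of one way attention could be modulated by a dependency graph, not the authors' implementation. It assumes an additive bias on the pre-softmax attention scores, a single scalar `weight`, and a hypothetical feature that is nonzero only when two tokens share an arc. The names `graph_bias` and `graph_modulated_attention` are illustrative and do not come from the released code. Because attention is causal, position i only uses arcs between tokens at or before i, which mirrors a graph built incrementally alongside token prediction.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def graph_bias(arcs, n, weight=1.0):
    """Hypothetical graph feature: bias[i][j] = weight if tokens i and j share an arc.

    Arcs are (head, dependent) index pairs. The bias is stored on the later
    token's row so that it can be used under a causal mask.
    """
    bias = [[0.0] * n for _ in range(n)]
    for head, dep in arcs:
        i, j = max(head, dep), min(head, dep)
        bias[i][j] = weight
    return bias


def graph_modulated_attention(scores, arcs, weight=1.0):
    """Causal attention with an additive graph bias on the logits (an assumption)."""
    n = len(scores)
    bias = graph_bias(arcs, n, weight)
    rows = []
    for i in range(n):
        # Position i attends only to positions 0..i, so only arcs among
        # already-seen tokens can influence its attention distribution.
        logits = [scores[i][j] + bias[i][j] for j in range(i + 1)]
        rows.append(softmax(logits) + [0.0] * (n - i - 1))
    return rows


# Toy example: uniform raw scores, so any difference comes from the graph bias.
tokens = ["the", "cat", "sleeps"]
scores = [[0.0] * 3 for _ in range(3)]
arcs = [(1, 0), (2, 1)]  # (head, dependent): cat -> the, sleeps -> cat
attn = graph_modulated_attention(scores, arcs, weight=1.0)
```

In this toy setup "cat" attends more to "the" than to itself, and "sleeps" attends more to "cat" than to "the", purely because of the arcs. No structural tokens are added to the sequence; only the attention distribution changes.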

