CL · AI · Jun 28, 2025

A Systematic Study of Compositional Syntactic Transformer Language Models

arXiv:2506.22978v1 · 1 citation · h-index: 6 · Has Code · ACL
Originality: Incremental advance
AI Analysis

This work addresses how to enhance language models with syntactic biases, a problem relevant to both researchers and practitioners. It is an incremental advance because it builds on existing compositional syntactic language models (SLMs).

The paper systematically studies compositional syntactic Transformer language models: it identifies the key design choices in existing models and proposes a unified framework that also covers novel variants. It finds that certain configurations improve language modeling and syntactic generalization, with up to 15% better perplexity on some datasets.

Syntactic language models (SLMs) enhance Transformers by incorporating syntactic biases through the modeling of linearized syntactic parse trees alongside surface sentences. This paper focuses on compositional SLMs that are based on constituency parse trees and contain explicit bottom-up composition of constituent representations. We identify key aspects of design choices in existing compositional SLMs and propose a unified framework encompassing both existing models and novel variants. We conduct a comprehensive empirical evaluation of all the variants in our framework across language modeling, syntactic generalization, summarization, dialogue, and inference efficiency. Based on the experimental results, we make multiple recommendations on the design of compositional SLMs. Our code is released at https://github.com/zhaoyd1/compositional_SLMs.
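
To make the two ideas in the abstract concrete, the following is a minimal sketch of (1) linearizing a constituency parse tree into a token sequence and (2) composing constituents bottom-up as closing brackets are read. It is an illustration under stated assumptions, not the paper's implementation: the function names `linearize` and `compose_bottom_up`, the bracket token format `(NP` / `NP)`, and the use of a nested tuple as a stand-in for a learned constituent representation are all hypothetical choices made here.

```python
def linearize(tree):
    """Flatten a constituency tree into a bracketed token sequence.

    A tree is either a word (str) or a (label, children) tuple.
    The "(X" / "X)" token format is an assumption for illustration.
    """
    if isinstance(tree, str):
        return [tree]
    label, children = tree
    tokens = [f"({label}"]
    for child in children:
        tokens.extend(linearize(child))
    tokens.append(f"{label})")
    return tokens


def compose_bottom_up(tokens):
    """Replace each finished constituent with a single composed item.

    On a closing bracket, pop children back to the matching opening
    bracket and push one item in their place. A real model would push
    a composed vector; a (label, children) tuple is a placeholder.
    Assumes no surface word looks like a bracket token.
    """
    stack = []
    for tok in tokens:
        if tok.endswith(")"):
            label = tok[:-1]
            children = []
            while stack[-1] != f"({label}":
                children.append(stack.pop())
            stack.pop()  # discard the opening bracket
            children.reverse()
            stack.append((label, children))
        else:
            stack.append(tok)
    return stack


tree = ("S", [("NP", ["the", "cat"]), ("VP", ["sat"])])
tokens = linearize(tree)
print(tokens)
print(compose_bottom_up(tokens))
```

In this sketch, each closing bracket collapses a completed constituent into one stack item, so later tokens see the constituent as a single unit rather than as its individual words. How the composed representation is computed and exposed to the Transformer is the kind of design choice the paper's framework varies.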

