CL, AI | May 20, 2025

Exploring Graph Representations of Logical Forms for Language Modeling

arXiv:2505.14523v2 | 1 citation | ACL

Originality: Incremental advance

AI Analysis

This paper addresses data efficiency in natural language processing, but the advance is incremental: it builds on existing ideas about logical forms and graph representations.

The paper tackles data inefficiency in language models by proposing language models over logical forms (LFLMs). It reports that the Graph-based Formal-Logical Distributional Semantics (GFoLDS) prototype vastly outperforms textual transformer LMs (BERT) pretrained on the same data on downstream tasks, indicating that LFLMs can learn from substantially less data.

We make the case for language models over logical forms (LFLMs), arguing that such models are more data-efficient than their textual counterparts. To that end, we introduce the Graph-based Formal-Logical Distributional Semantics (GFoLDS) prototype, a pretrained LM over graph representations of logical forms, as a proof-of-concept of LFLMs. Using GFoLDS, we present strong experimental evidence that LFLMs can leverage the built-in, basic linguistic knowledge inherent in such models to immediately begin learning more complex patterns. On downstream tasks, we show that GFoLDS vastly outperforms textual, transformer LMs (BERT) pretrained on the same data, indicating that LFLMs can learn with substantially less data than models over plain text. Furthermore, we show that the performance of this model is likely to scale with additional parameters and pretraining data, suggesting the viability of LFLMs in real-world applications.
