CL · AI · Nov 15, 2023

Transformers in the Service of Description Logic-based Contexts

arXiv:2311.08941v3 · h-index: 47 · Has Code
Originality: Incremental advance
AI Analysis

This work addresses the need for more challenging benchmarks for assessing reasoning in AI. Its contribution is incremental because it builds on existing transformer methods.

To evaluate transformer models on complex reasoning tasks, the authors construct DELTA$_D$, a 384K-example natural language dataset based on description logic. They find that a fine-tuned DeBERTa model masters the task, while GPT-3.5 and GPT-4 improve significantly with few-shot prompting (e.g., 9 shots).

Recent advances in transformer-based models have sparked research interest in their ability to learn to perform reasoning tasks. However, most of the contexts used for this purpose are in practice very simple: they are generated from short (fragments of) first-order logic sentences with only a few logical operators and quantifiers. In this work, we construct a natural language dataset, DELTA$_D$, using the description logic language $\mathcal{ALCQ}$. DELTA$_D$ contains 384K examples and increases in difficulty along two dimensions: i) reasoning depth, and ii) linguistic complexity. With it, we systematically investigate the reasoning ability of a supervised fine-tuned DeBERTa-based model and of two large language models (GPT-3.5, GPT-4) with few-shot prompting. Our results demonstrate that the DeBERTa-based model can master the reasoning task and that the performance of the GPT models can improve significantly even when only a small number of samples is provided (9 shots). We open-source our code and datasets.
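
To make the few-shot setting concrete, here is a minimal sketch of how a 9-shot prompt could be assembled from context/question/answer triples. Everything in it is an assumption for illustration: the `Example` fields, the prompt template, the label strings, and the sample sentences are hypothetical and are not taken from the DELTA$_D$ release or from the authors' prompting code.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Example:
    """One hypothetical reasoning example: a context, a question, and a label."""
    context: str
    question: str
    answer: str


def build_few_shot_prompt(shots: List[Example], query_context: str, query_question: str) -> str:
    """Concatenate solved examples, then append the unsolved query.

    The template (Context / Question / Answer) is an assumed format,
    not the one used in the paper.
    """
    parts = [
        f"Context: {ex.context}\nQuestion: {ex.question}\nAnswer: {ex.answer}"
        for ex in shots
    ]
    # The final block leaves the answer empty for the model to complete.
    parts.append(f"Context: {query_context}\nQuestion: {query_question}\nAnswer:")
    return "\n\n".join(parts)


# Hypothetical demonstration; the sentence mimics a qualified number
# restriction ("at least two ... who are ...") of the kind ALCQ can express.
demo = Example(
    context="Anne admires at least two people who are kind. "
            "Everyone who admires someone kind is friendly.",
    question="Anne is friendly.",
    answer="True",
)
shots = [demo] * 9  # 9 shots, matching the setting named in the abstract

prompt = build_few_shot_prompt(
    shots,
    query_context="Bob admires nobody.",
    query_question="Bob admires someone kind.",
)
```

In practice the nine demonstrations would be distinct examples drawn from the training split; the same example is repeated here only to keep the sketch short.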

Code Implementations: 1 repo
