CL, AI | Nov 15, 2023

AbsPyramid: Benchmarking the Abstraction Ability of Language Models with a Unified Entailment Graph

Tencent
arXiv:2311.09174v3 | 39 citations | h-index: 18
Originality: Incremental advance
AI Analysis

This work addresses the need for a comprehensive evaluation of abstraction ability in language models across diverse events. The advance is incremental: it builds on existing resources but extends them to a broader domain.

The authors tackle the under-explored problem of abstraction ability in language models by introducing AbsPyramid, a unified entailment graph of 221K textual descriptions. They find that current LLMs struggle with abstraction in zero-shot and few-shot settings but can acquire basic abstraction abilities when trained on this benchmark.

Cognitive research indicates that abstraction ability is essential in human intelligence, which remains under-explored in language models. In this paper, we present AbsPyramid, a unified entailment graph of 221K textual descriptions of abstraction knowledge. While existing resources only touch nouns or verbs within simplified events or specific domains, AbsPyramid collects abstract knowledge for three components of diverse events to comprehensively evaluate the abstraction ability of language models in the open domain. Experimental results demonstrate that current LLMs face challenges comprehending abstraction knowledge in zero-shot and few-shot settings. By training on our rich abstraction knowledge, we find LLMs can acquire basic abstraction abilities and generalize to unseen events. In the meantime, we empirically show that our benchmark is comprehensive to enhance LLMs across two previous abstraction tasks.
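
To make the idea of an entailment graph over textual descriptions concrete, here is a minimal sketch in Python. It is not the authors' data format or code: the class `EntailmentGraph`, its method names, and the example descriptions are all hypothetical, chosen only to illustrate how a directed edge from a specific description to a more abstract one can be stored and queried transitively.

```python
from collections import defaultdict, deque


class EntailmentGraph:
    """Toy directed graph: an edge (specific -> abstract) means that the
    specific description entails the more abstract one. Hypothetical
    illustration only, not the AbsPyramid schema."""

    def __init__(self):
        self._edges = defaultdict(set)

    def add_entailment(self, specific, abstract):
        # Record that `specific` can be abstracted to `abstract`.
        self._edges[specific].add(abstract)

    def abstractions_of(self, description):
        # Breadth-first search collecting every description reachable
        # through one or more entailment edges.
        seen = set()
        queue = deque([description])
        while queue:
            node = queue.popleft()
            for parent in self._edges.get(node, ()):
                if parent not in seen:
                    seen.add(parent)
                    queue.append(parent)
        return seen

    def entails(self, specific, abstract):
        # True if `abstract` is reachable from `specific`.
        return abstract in self.abstractions_of(specific)


# Made-up example descriptions (assumptions, not taken from the benchmark).
graph = EntailmentGraph()
graph.add_entailment("a golden retriever", "a dog")
graph.add_entailment("a dog", "an animal")
graph.add_entailment("PersonX walks the dog", "PersonX exercises a pet")

print(graph.entails("a golden retriever", "an animal"))
print(graph.entails("an animal", "a dog"))
```

In this sketch, entailment is directional: a specific description reaches its abstractions, but an abstract description does not reach its specializations. A benchmark instance could then be framed as a yes/no question about whether a candidate abstraction is valid for a given description, which is the kind of judgment the abstract says LLMs find difficult in zero-shot and few-shot settings.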

Code Implementations: 1 repo
