CL · Oct 10, 2022

Language Models Are Poor Learners of Directional Inference

arXiv:2210.04695v2 · 291 citations · h-index: 61
Originality: Incremental advance
AI Analysis

This paper addresses an important issue for NLP researchers by exposing dataset artifacts and model limitations in directional inference, though its benchmarking contribution is incremental.

The paper tackles language models' limited ability to learn directional predicate entailments. It shows that existing datasets are flawed and lead to over-optimistic results, and it introduces BoOQA as a robust benchmark on which LM-prompting models show little ability to learn directional entailment.

We examine LMs' competence of directional predicate entailments by supervised fine-tuning with prompts. Our analysis shows that contrary to their apparent success on standard NLI, LMs show limited ability to learn such directional inference; moreover, existing datasets fail to test directionality, and/or are infested by artefacts that can be learnt as proxy for entailments, yielding over-optimistic results. In response, we present BoOQA (Boolean Open QA), a robust multi-lingual evaluation benchmark for directional predicate entailments, extrinsic to existing training sets. On BoOQA, we establish baselines and show evidence of existing LM-prompting models being incompetent directional entailment learners, in contrast to entailment graphs, however limited by sparsity.

Code Implementations: 1 repo
