CL · Oct 10, 2022

Language Models Are Poor Learners of Directional Inference

Tianyi Li, Mohammad Javad Hosseini, Sabine Weber, Mark Steedman

arXiv:2210.04695v224.0291 citations · h-index: 61 · Has Code

Originality: Incremental advance

AI Analysis

This work addresses a critical issue for NLP researchers by exposing dataset artifacts and model limitations in directional inference, though its benchmarking contribution is incremental.

The paper examines language models' limited ability to learn directional predicate entailments. It shows that existing datasets are flawed and yield over-optimistic results, and it introduces BoOQA, a robust benchmark on which LMs prove to be poor learners of directional inference.

We examine LMs' competence of directional predicate entailments by supervised fine-tuning with prompts. Our analysis shows that contrary to their apparent success on standard NLI, LMs show limited ability to learn such directional inference; moreover, existing datasets fail to test directionality, and/or are infested by artefacts that can be learnt as proxy for entailments, yielding over-optimistic results. In response, we present BoOQA (Boolean Open QA), a robust multi-lingual evaluation benchmark for directional predicate entailments, extrinsic to existing training sets. On BoOQA, we establish baselines and show evidence of existing LM-prompting models being incompetent directional entailment learners, in contrast to entailment graphs, however limited by sparsity.
