CL, AI, LG · Mar 29, 2023

Zero-shot Entailment of Leaderboards for Empirical AI Research

arXiv:2303.16835v1 · 13 citations · h-index: 55
Originality: Synthesis-oriented
AI Analysis

This paper addresses model generalization and entailment learning, a problem relevant to researchers in AI and NLP. It is incremental, however: it builds on prior work without introducing a new method.

The paper investigates whether state-of-the-art models for extracting leaderboards from empirical AI research, with extraction formulated as a recognizing textual entailment (RTE) task, actually learn entailment. It tests them in a zero-shot setting on labels unseen during training, and it produces a zero-shot labeled dataset along the way.

We present a large-scale empirical investigation of the zero-shot learning phenomenon in a specific recognizing textual entailment (RTE) task category, i.e., the automated mining of leaderboards for Empirical AI Research. The previously reported state-of-the-art models for leaderboard extraction formulated as an RTE task, in a non-zero-shot setting, are promising, with reported performances above 90%. However, a central research question remains unexamined: did the models actually learn entailment? Thus, for the experiments in this paper, two previously reported state-of-the-art models are tested out-of-the-box for their ability to generalize, or their capacity for entailment, given leaderboard labels that were unseen during training. We hypothesize that if the models learned entailment, their zero-shot performances can be expected to be moderately high as well, or, concretely, at least better than chance. As a by-product of this work, a zero-shot labeled dataset for the leaderboard extraction RTE task is created via distant labeling.
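The zero-shot test described in the abstract can be illustrated with a minimal sketch. It is not the paper's code. It assumes that each example pairs a paper text (premise) with a candidate leaderboard label (hypothesis) written as "task; dataset; metric", and that "chance" is approximated by majority-class accuracy. The names `Pair`, `zero_shot_subset`, `accuracy`, `majority_baseline`, and `toy_predict` are hypothetical, and the example sentences are made-up toy data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pair:
    premise: str     # paper text (assumed representation)
    hypothesis: str  # candidate leaderboard label, "task; dataset; metric" (assumed format)
    entailed: bool   # gold label, e.g. obtained by distant labeling


def zero_shot_subset(pairs, seen_labels):
    """Keep only pairs whose hypothesis label never appeared in training."""
    return [p for p in pairs if p.hypothesis not in seen_labels]


def accuracy(pairs, predict):
    """Fraction of pairs where predict(premise, hypothesis) matches the gold label."""
    if not pairs:
        raise ValueError("no pairs to evaluate")
    correct = sum(predict(p.premise, p.hypothesis) == p.entailed for p in pairs)
    return correct / len(pairs)


def majority_baseline(pairs):
    """Accuracy of always guessing the more frequent class (a stand-in for chance)."""
    if not pairs:
        raise ValueError("no pairs to evaluate")
    positives = sum(p.entailed for p in pairs)
    return max(positives, len(pairs) - positives) / len(pairs)


def toy_predict(premise, hypothesis):
    """Toy stand-in for a trained RTE model: entailed if the dataset name occurs in the text."""
    dataset = hypothesis.split(";")[1].strip()
    return dataset in premise


# Made-up toy data; only the first label is treated as seen during training.
PAIRS = [
    Pair("We report F1 on SQuAD for question answering.", "question answering; SQuAD; F1", True),
    Pair("We report F1 on SQuAD for question answering.", "machine translation; WMT14; BLEU", False),
    Pair("We evaluate BLEU on WMT14 translation.", "machine translation; WMT14; BLEU", True),
    Pair("We evaluate BLEU on WMT14 translation.", "image classification; ImageNet; accuracy", False),
]
SEEN_LABELS = {"question answering; SQuAD; F1"}

unseen_pairs = zero_shot_subset(PAIRS, SEEN_LABELS)
zero_shot_accuracy = accuracy(unseen_pairs, toy_predict)
baseline = majority_baseline(unseen_pairs)
learned_something = zero_shot_accuracy > baseline
```

Under these assumptions, a model that had memorized its training labels rather than learned entailment would be expected to score near the baseline on `unseen_pairs`. In a real evaluation, `toy_predict` would be replaced by the trained model's prediction function.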
