CL, Jan 7, 2022

The Defeat of the Winograd Schema Challenge

Vid Kocijan, Ernest Davis, Thomas Lukasiewicz, Gary Marcus, Leora Morgenstern

arXiv:2201.02387v34.551 citations

Originality: Synthesis-oriented

AI Analysis

The paper gives the AI research community a retrospective analysis of how surrogate tasks such as the WSC have shaped the understanding of how to assess the intelligence of an AI system.

The paper reviews the history of the Winograd Schema Challenge, a benchmark for pronoun disambiguation requiring commonsense knowledge, and notes that by 2019, AI systems achieved over 90% accuracy on it.

The Winograd Schema Challenge - a set of twin sentences involving pronoun reference disambiguation that seem to require the use of commonsense knowledge - was proposed by Hector Levesque in 2011. By 2019, a number of AI systems, based on large pre-trained transformer-based language models and fine-tuned on these kinds of problems, achieved better than 90% accuracy. In this paper, we review the history of the Winograd Schema Challenge and discuss the lasting contributions of the flurry of research that has taken place on the WSC in the last decade. We discuss the significance of various datasets developed for WSC, and the research community's deeper understanding of the role of surrogate tasks in assessing the intelligence of an AI system.
