RiddleSense: Reasoning about Riddle Questions Featuring Linguistic Creativity and Commonsense Knowledge
RiddleSense addresses the lack of resources for evaluating advanced natural language understanding (NLU) abilities, specifically linguistic creativity and higher-order commonsense reasoning.
This paper introduces RiddleSense, a new multiple-choice question answering task and dataset of 5.7k examples, designed to evaluate complex commonsense reasoning, figurative language understanding, and counterfactual reasoning in riddle-style questions. The authors' evaluation shows a large gap between the best supervised model and human performance.
Question: "I have five fingers but I am not alive. What am I?" Answer: a glove. Answering such a riddle-style question is a challenging cognitive process: it requires complex commonsense reasoning, an understanding of figurative language, and counterfactual reasoning skills, all of which are important for advanced natural language understanding (NLU). However, there are currently no dedicated datasets that test these abilities. Herein, we present RiddleSense, a new multiple-choice question answering task, which comes with the first large dataset (5.7k examples) for answering riddle-style commonsense questions. We systematically evaluate a wide range of models on the challenge and find a large gap between the best supervised model and human performance, which suggests intriguing future research on higher-order commonsense reasoning and linguistic creativity towards building advanced NLU systems.
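To make the multiple-choice setup concrete, the following is a minimal sketch of how one riddle item might be represented and how accuracy over such items could be computed. The class `RiddleItem`, its field names, the helper `accuracy`, and the distractor choices are all assumptions made for illustration; they are not the dataset's actual schema or contents. Only the riddle and its answer come from the abstract.

```python
from dataclasses import dataclass
from typing import Callable, Sequence, Tuple


@dataclass(frozen=True)
class RiddleItem:
    """Hypothetical container for one multiple-choice riddle (not the official schema)."""
    question: str
    choices: Tuple[str, ...]
    answer_index: int  # position of the correct choice within `choices`


def accuracy(items: Sequence[RiddleItem],
             predict: Callable[[RiddleItem], int]) -> float:
    """Fraction of items for which `predict` returns the correct choice index."""
    if not items:
        raise ValueError("cannot compute accuracy over an empty item list")
    correct = sum(1 for item in items if predict(item) == item.answer_index)
    return correct / len(items)


def first_choice_baseline(item: RiddleItem) -> int:
    """Trivial stand-in for a model: always picks the first choice."""
    return 0


# The riddle and answer are from the abstract; the distractors are invented here.
items = [
    RiddleItem(
        question="I have five fingers but I am not alive. What am I?",
        choices=("a hand", "a glove", "a statue"),
        answer_index=1,
    )
]

print(accuracy(items, first_choice_baseline))
```

Any model under evaluation would take the place of `first_choice_baseline`; comparing its accuracy with human accuracy on the same items gives the kind of gap the abstract refers to.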