A Benchmark Arabic Dataset for Commonsense Explanation
This work addresses the under-researched problem of language comprehension and commonsense knowledge validation for Arabic text by providing a new benchmark dataset for researchers.
This paper introduces a benchmark Arabic dataset for commonsense explanation, consisting of Arabic sentences that do not make sense, each accompanied by three multiple-choice explanations, one of which states why the sentence is false. The authors also provide baseline results to facilitate future research and evaluation in this domain.
Language comprehension and commonsense knowledge validation by machines are challenging tasks that remain under-researched and under-evaluated for Arabic text. In this paper, we present a benchmark Arabic dataset for commonsense explanation. The dataset consists of Arabic sentences that do not make sense, each accompanied by three choices, from which the one that explains why the sentence is false must be selected. Furthermore, this paper presents baseline results to assist and encourage future research and evaluation in this field. The dataset is distributed under the Creative Commons CC-BY-SA 4.0 license and can be found on GitHub.
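To make the task format concrete, the following is a minimal Python sketch of how one item (a nonsensical sentence with three candidate explanations) might be represented and how accuracy could be computed for a predictor. The class name `ExplanationItem`, its field names, the `accuracy` helper, and the trivial `first_choice_baseline` are all hypothetical and are not taken from the dataset or from the baselines reported in the paper. The example sentence is invented and written in English for readability; the actual dataset is in Arabic and its file layout may differ.

```python
from dataclasses import dataclass
from typing import Callable, Sequence, Tuple


@dataclass(frozen=True)
class ExplanationItem:
    """Hypothetical record: one nonsensical sentence and three explanations."""
    sentence: str
    choices: Tuple[str, str, str]
    answer: int  # index (0, 1, or 2) of the explanation that is correct


def accuracy(items: Sequence[ExplanationItem],
             predict: Callable[[ExplanationItem], int]) -> float:
    """Fraction of items for which predict() returns the correct index."""
    if not items:
        return 0.0
    correct = sum(1 for item in items if predict(item) == item.answer)
    return correct / len(items)


def first_choice_baseline(item: ExplanationItem) -> int:
    """Trivial illustrative predictor that always picks the first choice."""
    return 0


# Invented example item (assumption, not drawn from the dataset).
example = ExplanationItem(
    sentence="He put the elephant in his pocket.",
    choices=(
        "An elephant is far too large to fit in a pocket.",
        "Elephants are grey.",
        "Pockets are usually sewn into trousers.",
    ),
    answer=0,
)
```

With three choices per item, a predictor that guesses uniformly at random would be expected to reach about one third accuracy, which is a natural floor against which any model could be compared.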