E-KAR: A Benchmark for Rationalizing Natural Language Analogical Reasoning
This paper addresses the evaluation of explainable analogical reasoning in AI models and is aimed mainly at researchers in natural language processing. Its contribution is incremental: it introduces a new benchmark rather than a novel method.
The authors address the lack of benchmarks that reveal the analogical reasoning process of neural models by proposing E-KAR, a first-of-its-kind Explainable Knowledge-intensive Analogical Reasoning benchmark. It contains 1,655 Chinese and 1,251 English problems sourced from the Civil Service Exams, and it proved very challenging for some state-of-the-art models on both explanation generation and analogical question answering.
The ability to recognize analogies is fundamental to human cognition. Existing benchmarks for testing word analogy do not reveal the underlying process of analogical reasoning in neural models. Holding the belief that models capable of reasoning should be right for the right reasons, we propose a first-of-its-kind Explainable Knowledge-intensive Analogical Reasoning benchmark (E-KAR). Our benchmark consists of 1,655 (in Chinese) and 1,251 (in English) problems sourced from the Civil Service Exams, which require intensive background knowledge to solve. More importantly, we design a free-text explanation scheme to explain whether an analogy should be drawn, and manually annotate these explanations for each and every question and candidate answer. Empirical results suggest that this benchmark is very challenging for some state-of-the-art models on both explanation generation and analogical question answering tasks, which invites further research in this area.
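To make the annotation scheme more concrete, the following is a minimal sketch of how one such problem might be represented and how question answering accuracy could be computed. The class name, field names, and the toy example are assumptions made for illustration; they are not the benchmark's official data format, and the toy item is invented rather than taken from E-KAR.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class AnalogyProblem:
    """One multiple-choice analogy problem (hypothetical schema, not the official one)."""
    question: str                      # the question term tuple
    candidates: List[str]              # candidate answer tuples
    answer_index: int                  # index of the correct candidate
    question_explanation: str          # free-text explanation for the question
    candidate_explanations: List[str]  # one free-text explanation per candidate

    def __post_init__(self) -> None:
        # The abstract states that every question and every candidate is annotated.
        if len(self.candidate_explanations) != len(self.candidates):
            raise ValueError("each candidate needs exactly one explanation")
        if not 0 <= self.answer_index < len(self.candidates):
            raise ValueError("answer_index is out of range")


def accuracy(problems: List[AnalogyProblem], predictions: List[int]) -> float:
    """Fraction of problems whose predicted candidate index is correct."""
    if len(problems) != len(predictions):
        raise ValueError("need exactly one prediction per problem")
    if not problems:
        return 0.0
    correct = sum(p.answer_index == pred for p, pred in zip(problems, predictions))
    return correct / len(problems)


# Invented toy item, for illustration only (not from the benchmark).
toy = AnalogyProblem(
    question="bird : nest",
    candidates=["fish : water", "bee : hive", "car : road"],
    answer_index=1,
    question_explanation="A nest is a home that a bird builds.",
    candidate_explanations=[
        "Water is where a fish lives, but the fish does not build it.",
        "A hive is a home that bees build.",
        "A road is where a car travels, not its home.",
    ],
)
```

Under this sketch, the question answering task reduces to predicting `answer_index`, while the explanation generation task would target the free-text fields; scoring generated explanations needs a text-similarity or human evaluation step that is not shown here.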