Unsupervised Explanation Generation via Correct Instantiations
This paper addresses the problem of explanation generation for AI systems, particularly in commonsense reasoning, but the contribution is incremental because it builds on existing unsupervised methods and benchmarks.
The paper tackles the challenge of generating explanations for why statements are wrong, such as against commonsense, by proposing Neon, an unsupervised two-phase framework that first generates corrected instantiations and then uses them to prompt large language models to find conflict points. Neon outperforms baselines on benchmarks like ComVE and e-SNLI in both automatic and human evaluations, demonstrating effectiveness in generalizing to different scenarios.
While large pre-trained language models (PLMs) have shown strong performance on discriminative tasks, a significant gap remains between them and humans on explanation-related tasks. Among these tasks, explaining why a statement is wrong (e.g., against commonsense) is especially challenging. The major difficulty is finding the conflict point, where the statement contradicts the real world. This paper proposes Neon, a two-phase, unsupervised explanation generation framework. Neon first generates corrected instantiations of the statement (phase I), then uses them to prompt large PLMs to find the conflict point and complete the explanation (phase II). We conduct extensive experiments on two standard explanation benchmarks, i.e., ComVE and e-SNLI. According to both automatic and human evaluations, Neon outperforms baselines, even those that use human-annotated instantiations. In addition to explaining a negative prediction, we further demonstrate that Neon remains effective when generalizing to different scenarios.
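The two phases described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the prompt wording, the function names (`generate_instantiations`, `build_explanation_prompt`, `explain`), and the `Generator` callable that stands in for a large PLM are all assumptions made for this example.

```python
from typing import Callable, List

# Hypothetical stand-in for a large PLM: any function mapping a prompt to text.
Generator = Callable[[str], str]


def generate_instantiations(statement: str, generate: Generator, n: int = 3) -> List[str]:
    """Phase I (sketch): collect up to n distinct corrected versions of a false statement."""
    # Assumed prompt wording; the abstract does not give the actual template.
    prompt = f"Incorrect statement: {statement}\nCorrected statement:"
    instantiations: List[str] = []
    for _ in range(n):
        candidate = generate(prompt).strip()
        # Drop empty outputs, copies of the input, and duplicates.
        if candidate and candidate != statement and candidate not in instantiations:
            instantiations.append(candidate)
    return instantiations


def build_explanation_prompt(statement: str, instantiations: List[str]) -> str:
    """Phase II prompt (sketch): show the corrected instantiations, then the false statement."""
    lines = [f"It is true that: {inst}" for inst in instantiations]
    lines.append(f"It is false that: {statement}")
    lines.append("Explanation:")
    return "\n".join(lines)


def explain(statement: str, generate: Generator, n: int = 3) -> str:
    """Run both phases and return the model's completion as the explanation."""
    instantiations = generate_instantiations(statement, generate, n)
    return generate(build_explanation_prompt(statement, instantiations)).strip()
```

Passing the model as a plain callable keeps the sketch independent of any particular library. The intuition is that contrasting the false statement with its corrected instantiations exposes what differs between them, which is where the conflict point is likely to lie.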