CL, LG | Mar 7, 2023

CoTEVer: Chain of Thought Prompting Annotation Toolkit for Explanation Verification

CMU, Georgia Tech
arXiv:2303.03628v1272 citations | h-index: 22 | Has Code
Originality: Synthesis-oriented
AI Analysis

This work addresses a bottleneck in CoT prompting, the shortage of explanation data for fine-tuning, by giving researchers and practitioners a tool for building such datasets. The contribution is incremental: it focuses on data collection rather than on a new method.

The authors tackled the problem of verifying the factual correctness of chain-of-thought explanations generated by large language models, introducing CoTEVer, a toolkit for annotating explanation correctness and collecting revision data to improve explanation faithfulness.

Chain-of-thought (CoT) prompting enables large language models (LLMs) to solve complex reasoning tasks by generating an explanation before the final prediction. Despite its promising ability, a critical downside of CoT prompting is that performance is greatly affected by the factuality of the generated explanation. To improve the correctness of the explanations, fine-tuning language models with explanation data is needed. However, there exist only a few datasets that can be used for such approaches, and no data collection tool for building them. Thus, we introduce CoTEVer, a toolkit for annotating the factual correctness of generated explanations and collecting revision data for wrong explanations. Furthermore, we suggest several use cases where the data collected with CoTEVer can be utilized for enhancing the faithfulness of explanations. Our toolkit is publicly available at https://github.com/SeungoneKim/CoTEVer.
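
To make the annotation idea in the abstract concrete, here is a minimal Python sketch of what one annotated explanation might look like: each reasoning step carries a correctness label and, if wrong, a revision. The class names, field names, and example data are hypothetical and chosen for illustration only; they are not taken from the CoTEVer repository and may not match its actual schema.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class StepAnnotation:
    """One explanation step plus an annotator's verdict (hypothetical schema)."""
    text: str
    is_correct: bool
    revision: Optional[str] = None  # set only when the step is judged wrong


@dataclass
class AnnotatedExplanation:
    """A question with its generated explanation, annotated step by step."""
    question: str
    steps: List[StepAnnotation] = field(default_factory=list)

    def revision_pairs(self) -> List[Tuple[str, str]]:
        """Return (wrong_step, revised_step) pairs, e.g. as fine-tuning data."""
        return [
            (s.text, s.revision)
            for s in self.steps
            if not s.is_correct and s.revision is not None
        ]

    def revised_steps(self) -> List[str]:
        """Return the explanation with wrong steps replaced by their revisions."""
        return [
            s.revision if (not s.is_correct and s.revision is not None) else s.text
            for s in self.steps
        ]


# Illustrative data, not drawn from any real dataset.
example = AnnotatedExplanation(
    question="Does water boil at room temperature at sea level?",
    steps=[
        StepAnnotation(
            text="Water boils at 90 degrees Celsius at sea level.",
            is_correct=False,
            revision="Water boils at 100 degrees Celsius at sea level.",
        ),
        StepAnnotation(
            text="Room temperature is about 20 degrees Celsius.",
            is_correct=True,
        ),
    ],
)

pairs = example.revision_pairs()
fixed = example.revised_steps()
```

Under these assumptions, `pairs` holds the wrong-to-revised step pairs that could serve as training signal, while `fixed` is the corrected explanation with correct steps left untouched.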

Code Implementations: 1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
