CL · AI · Nov 9, 2023

Characterizing Large Language Models as Rationalizers of Knowledge-intensive Tasks

arXiv:2311.05085v2 · 32 citations · h-index: 10
AI Analysis

This paper addresses the problem of trustworthy AI explanations for users in knowledge-intensive domains, though its contribution is incremental, building on existing few-shot methods.

The study investigates whether large language models (LLMs) can generate well-grounded rationales for knowledge-intensive tasks such as commonsense multiple-choice questions. Crowd-workers preferred LLM-generated rationales over crowdsourced ones for their factuality and sufficiency, but the rationales still need improvement in conciseness and novelty.

Large language models (LLMs) are proficient at generating fluent text with minimal task-specific supervision. Yet, their ability to provide well-grounded rationalizations for knowledge-intensive tasks remains under-explored. Such tasks, like commonsense multiple-choice questions, require rationales based on world knowledge to support predictions and refute alternate options. We consider the task of generating knowledge-guided rationalization in natural language by using expert-written examples in a few-shot manner. Surprisingly, crowd-workers preferred knowledge-grounded rationales over crowdsourced rationalizations, citing their factuality, sufficiency, and comprehensive refutations. Although LLM-generated rationales were preferable, further improvements in conciseness and novelty are required. In another study, we show how rationalization of incorrect model predictions erodes humans' trust in LLM-generated rationales. Motivated by these observations, we create a two-stage pipeline to review task predictions and eliminate potential incorrect decisions before rationalization, enabling trustworthy rationale generation.
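
The abstract's two ideas, few-shot prompting with expert-written examples and a review step that filters predictions before rationalization, can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and is not taken from the paper: the `Example` record, the prompt layout, and the `review` and `generate` callables, which stand in for whatever reviewer and LLM one would actually use.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass
class Example:
    """One expert-written demonstration (hypothetical structure)."""
    question: str
    options: Sequence[str]
    answer: str
    rationale: str


def build_few_shot_prompt(
    examples: Sequence[Example],
    question: str,
    options: Sequence[str],
    answer: str,
) -> str:
    """Concatenate demonstrations, then the new item with an open rationale slot."""
    blocks = [
        f"Question: {ex.question}\n"
        f"Options: {', '.join(ex.options)}\n"
        f"Answer: {ex.answer}\n"
        f"Rationale: {ex.rationale}"
        for ex in examples
    ]
    blocks.append(
        f"Question: {question}\n"
        f"Options: {', '.join(options)}\n"
        f"Answer: {answer}\n"
        f"Rationale:"
    )
    return "\n\n".join(blocks)


def rationalize_if_trusted(
    question: str,
    options: Sequence[str],
    predicted: str,
    review: Callable[[str, Sequence[str], str], bool],
    generate: Callable[[str], str],
    examples: Sequence[Example],
) -> Optional[str]:
    """Stage 1 reviews the prediction; stage 2 rationalizes only if it passes."""
    # Stage 1: an assumed reviewer decides whether the prediction looks correct.
    if not review(question, options, predicted):
        return None  # withhold a rationale rather than justify a likely error
    # Stage 2: few-shot rationalization of the accepted prediction.
    prompt = build_few_shot_prompt(examples, question, options, predicted)
    return generate(prompt)
```

Returning `None` for rejected predictions is one possible design choice; it reflects the abstract's observation that rationalizing incorrect predictions erodes trust, but the paper's own handling of rejected cases may differ.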

