CL · AI · Nov 9, 2023

Characterizing Large Language Models as Rationalizers of Knowledge-intensive Tasks

Aditi Mishra, Sajjadur Rahman, Hannah Kim, Kushan Mitra, Estevam Hruschka

arXiv:2311.05085v211.132 citations · h-index: 10

Originality: Incremental advance

AI Analysis

This paper addresses the problem of trustworthy AI explanations for users in knowledge-intensive domains, though the advance is incremental because it builds on existing few-shot methods.

The study investigated whether large language models (LLMs) can generate well-grounded rationales for knowledge-intensive tasks such as commonsense multiple-choice questions. Crowd-workers preferred LLM-generated rationales over crowdsourced ones for factuality and sufficiency, but the rationales still need improvement in conciseness and novelty.

Large language models (LLMs) are proficient at generating fluent text with minimal task-specific supervision. Yet, their ability to provide well-grounded rationalizations for knowledge-intensive tasks remains under-explored. Such tasks, like commonsense multiple-choice questions, require rationales based on world knowledge to support predictions and refute alternate options. We consider the task of generating knowledge-guided rationalization in natural language by using expert-written examples in a few-shot manner. Surprisingly, crowd-workers preferred knowledge-grounded rationales over crowdsourced rationalizations, citing their factuality, sufficiency, and comprehensive refutations. Although LLM-generated rationales were preferable, further improvements in conciseness and novelty are required. In another study, we show how rationalization of incorrect model predictions erodes humans' trust in LLM-generated rationales. Motivated by these observations, we create a two-stage pipeline to review task predictions and eliminate potential incorrect decisions before rationalization, enabling trustworthy rationale generation.
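
The two-stage idea in the abstract (review the task prediction first, rationalize only what passes review) can be illustrated with a minimal Python sketch. The abstract does not say how the review step or the prompt is implemented, so everything here is an assumption made for illustration: the names `Example`, `build_few_shot_prompt`, and `rationalize_with_review` are hypothetical, and `reviewer` and `generator` are placeholder callables standing in for whatever models the authors actually use.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Example:
    """One multiple-choice item plus the task model's prediction (hypothetical type)."""
    question: str
    options: List[str]
    predicted: str


def build_few_shot_prompt(demos: List[str], ex: Example) -> str:
    """Join expert-written demonstrations with the new item, ending at 'Rationale:'."""
    query = (
        f"Question: {ex.question}\n"
        f"Options: {', '.join(ex.options)}\n"
        f"Answer: {ex.predicted}\n"
        "Rationale:"
    )
    return "\n\n".join(demos + [query])


def rationalize_with_review(
    ex: Example,
    reviewer: Callable[[Example], bool],
    generator: Callable[[str], str],
    demos: List[str],
) -> Optional[str]:
    """Assumed two-stage flow: filter the prediction, then generate a rationale."""
    # Stage 1: the reviewer decides whether the prediction looks trustworthy.
    if not reviewer(ex):
        # Flagged predictions are dropped, so no rationale is produced for them.
        return None
    # Stage 2: few-shot rationalization of the accepted prediction.
    return generator(build_few_shot_prompt(demos, ex))
```

In this sketch a rejected prediction yields `None` rather than a rationale, reflecting the abstract's observation that rationalizing incorrect predictions erodes trust; how the real pipeline handles rejected items is not stated in the abstract.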
