AI | HC | Feb 27, 2025

Evaluating Human Trust in LLM-Based Planners: A Preliminary Study

arXiv:2502.20284v1 | 2 citations | h-index: 52
Originality: Synthesis-oriented
AI Analysis

This study addresses trust issues faced by users adopting LLM-based planning systems, but it is preliminary and incremental.

The study compares human trust in LLM-based planners with trust in classical planners. It finds that correctness is the main driver of both trust and performance. Explanations improve evaluation accuracy but have limited impact on trust, while plan refinement shows potential to increase trust without significantly boosting accuracy.

Large Language Models (LLMs) are increasingly used for planning tasks, offering unique capabilities not found in classical planners such as generating explanations and iterative refinement. However, trust--a critical factor in the adoption of planning systems--remains underexplored in the context of LLM-based planning tasks. This study bridges this gap by comparing human trust in LLM-based planners with classical planners through a user study in a Planning Domain Definition Language (PDDL) domain. Combining subjective measures, such as trust questionnaires, with objective metrics like evaluation accuracy, our findings reveal that correctness is the primary driver of trust and performance. Explanations provided by the LLM improved evaluation accuracy but had limited impact on trust, while plan refinement showed potential for increasing trust without significantly enhancing evaluation accuracy.

