AI CL HC LG | Jan 25, 2025

Feedback-Aware Monte Carlo Tree Search for Efficient Information Seeking in Goal-Oriented Conversations

arXiv:2501.15056v2 | 10 citations | h-index: 1

Originality: Incremental advance

AI Analysis

This work addresses efficient decision-making in conversational systems for domains such as medical diagnosis and technical troubleshooting. It is an incremental advance, with reported gains in both success rate and the number of LLM calls needed for planning.

The paper addresses the efficient acquisition of missing information in goal-oriented conversations. It introduces a feedback-aware Monte Carlo Tree Search (MCTS) framework that uses LLMs for question generation, and reports an average 12% improvement in success rates and roughly a 10x reduction in LLM calls for planning per conversation compared to the state of the art.

Effective decision-making and problem-solving in conversational systems require the ability to identify and acquire missing information through targeted questioning. A key challenge lies in efficiently narrowing down a large space of possible outcomes by posing questions that minimize uncertainty. To address this, we introduce a novel framework that leverages Large Language Models (LLMs) to generate information-seeking questions, with Monte Carlo Tree Search (MCTS) to strategically select questions that maximize information gain, as a part of inference-time planning. Our primary contribution includes a hierarchical feedback mechanism that exploits past interaction patterns to guide future strategy. Specifically, each new problem is mapped to a cluster based on semantic similarity, and our UCT (Upper Confidence bound for Trees) formulation employs a cluster-specific bonus reward to prioritize successful question trajectories that have proven effective for similar problems in the past. Extensive empirical evaluation across medical diagnosis and technical troubleshooting domains shows that our method achieves an average of 12% improvement in success rates and about 10x reduction in the number of LLM calls made for planning per conversation, compared to the state of the art. An additional 8% gain in success rate is observed on average when we start with a constrained set of possibilities. Our results underscore the efficacy of feedback-aware MCTS in enhancing information-seeking in goal-oriented dialogues.
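
The abstract does not give the exact UCT formulation, so the following is only a minimal sketch of how a cluster-specific bonus could enter a standard UCT score. The additive form of the bonus, the names `uct_score` and `select_question`, and the parameters `c` and `bonus_weight` are all assumptions made for illustration and are not taken from the paper.

```python
import math


def uct_score(total_reward, visits, parent_visits, cluster_bonus=0.0,
              c=math.sqrt(2), bonus_weight=1.0):
    """Standard UCT value plus a hypothetical additive cluster bonus.

    cluster_bonus is assumed to summarize how well this question has
    worked on past problems in the same semantic cluster.
    """
    if visits == 0:
        return math.inf  # unvisited children are always tried first
    exploit = total_reward / visits  # mean reward observed so far
    explore = c * math.sqrt(math.log(parent_visits) / visits)
    return exploit + explore + bonus_weight * cluster_bonus


def select_question(children, parent_visits, bonuses):
    """Pick the candidate question with the highest score.

    children: dict mapping question -> (total_reward, visits)
    bonuses:  dict mapping question -> bonus for the current problem's
              cluster (questions without an entry get no bonus)
    """
    return max(
        children,
        key=lambda q: uct_score(*children[q], parent_visits,
                                bonuses.get(q, 0.0)),
    )


# Two candidate questions with identical search statistics.
children = {"fever?": (3.0, 5), "cough?": (3.0, 5)}
# Assumed feedback: "cough?" helped on similar past problems.
bonuses = {"cough?": 0.2}
best = select_question(children, parent_visits=10, bonuses=bonuses)
```

In this sketch, when two questions have the same visit counts and rewards, the cluster bonus is what separates them, which is one plausible reading of "prioritize successful question trajectories that have proven effective for similar problems in the past."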
