CL · Jun 3, 2025

TO-GATE: Clarifying Questions and Summarizing Responses with Trajectory Optimization for Eliciting Human Preference

arXiv:2506.02827v1 · 2 citations · h-index: 1
Originality: Incremental advance
AI Analysis

This work addresses the challenge of eliciting human preferences more effectively through multi-turn dialogue, offering a domain-specific improvement for AI assistants and interactive systems.

The paper tackles inefficient dialogue trajectories in LLM-based human preference elicitation. It proposes TO-GATE, a framework that uses trajectory optimization to generate clarifying questions and task-aligned summaries, and reports a 9.32% improvement over baselines on standard preference elicitation tasks.

Large language models (LLMs) can effectively elicit human preferences through multi-turn dialogue. Complex tasks can be accomplished through iterative clarifying questions and final responses generated by an LLM acting as a questioner (STaR-GATE; Andukuri et al., 2024). However, existing approaches based on self-taught reasoning struggle to identify optimal dialogue trajectories and to avoid questions that are irrelevant to the task. To address this limitation, we propose TO-GATE, a novel framework that enhances question generation through trajectory optimization. The framework consists of two key components: a clarification resolver that generates optimal questioning trajectories, and a summarizer that ensures task-aligned final responses. The trajectory optimization enables the model to produce effective elicitation questions and summary responses tailored to specific tasks. Experimental results demonstrate that TO-GATE significantly outperforms baseline methods, achieving a 9.32% improvement on standard preference elicitation tasks.
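
As a rough illustration of the two components named in the abstract, the sketch below enumerates candidate questioning trajectories, keeps the highest-scoring one, and summarizes its answers into a final response. It is not the paper's implementation. The names `candidate_trajectories`, `score_trajectory`, `best_trajectory`, and `summarize`, the canned question pool, and the keyword-based score are all hypothetical stand-ins for LLM calls and for whatever objective TO-GATE actually optimizes.

```python
from dataclasses import dataclass
from itertools import permutations

# Hypothetical pool of clarifying questions with simulated user answers.
# In a real system both sides would be LLM calls; these are toy stand-ins.
QUESTION_POOL = {
    "What cuisine do you prefer?": "I like spicy Thai food.",
    "Any dietary restrictions?": "I am vegetarian.",
    "What is your favorite color?": "Blue.",
    "How much time do you have to cook?": "About 30 minutes.",
}

# Hypothetical keywords marking a question as relevant to a recipe task.
TASK_KEYWORDS = {"cuisine", "dietary", "cook"}


@dataclass
class Trajectory:
    turns: list  # list of (question, answer) pairs, in dialogue order


def candidate_trajectories(n_turns=2):
    """Enumerate every ordered sequence of n_turns distinct questions."""
    for questions in permutations(QUESTION_POOL, n_turns):
        yield Trajectory([(q, QUESTION_POOL[q]) for q in questions])


def score_trajectory(traj, task_keywords):
    """Toy objective (an assumption): count the task-relevant questions."""
    return sum(
        any(k in question.lower() for k in task_keywords)
        for question, _ in traj.turns
    )


def best_trajectory(task_keywords, n_turns=2):
    """Stand-in for the clarification resolver: pick the top-scoring trajectory."""
    return max(
        candidate_trajectories(n_turns),
        key=lambda t: score_trajectory(t, task_keywords),
    )


def summarize(traj):
    """Stand-in for the summarizer: join the elicited answers into one response."""
    return " ".join(answer for _, answer in traj.turns)


if __name__ == "__main__":
    best = best_trajectory(TASK_KEYWORDS)
    for question, answer in best.turns:
        print(f"Q: {question}\nA: {answer}")
    print("Summary:", summarize(best))
```

Exhaustive enumeration is only feasible because the toy pool is tiny; an actual system would need to search or learn over a far larger space of generated questions.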

