HiMAP-Travel: Hierarchical Multi-Agent Planning for Long-Horizon Constrained Travel
This work improves the long-horizon planning capabilities of LLM agents in applications that must satisfy strict constraints, such as travel planning, where users need reliable automated plans.
The paper addresses the failure of sequential LLM agents in long-horizon planning with hard constraints by proposing HiMAP-Travel, a hierarchical multi-agent framework. This framework achieves a 52.78% validation and 52.65% test Final Pass Rate on TravelPlanner, outperforming sequential DeepTravel by +8.67 percentage points and reducing latency by 2.5x.
Sequential LLM agents fail on long-horizon planning with hard constraints such as budgets and diversity requirements. As planning progresses and context grows, these agents drift from global constraints. We propose HiMAP-Travel, a hierarchical multi-agent framework that splits planning into strategic coordination and parallel day-level execution. A Coordinator allocates resources across days, while Day Executors plan independently in parallel. Three key mechanisms enable this: a transactional monitor that enforces budget and uniqueness constraints across parallel agents, a bargaining protocol that lets agents reject infeasible sub-goals and trigger re-planning, and a single policy trained with GRPO that powers all agents through role conditioning. On TravelPlanner, HiMAP-Travel with Qwen3-8B achieves 52.78% validation and 52.65% test Final Pass Rate (FPR). In a controlled comparison with the same model, training, and tools, it outperforms the sequential DeepTravel baseline by +8.67 pp. It also surpasses ATLAS by +17.65 pp and MTP by +10.0 pp. On FlexTravelBench multi-turn scenarios, it achieves 44.34% (2-turn) and 37.42% (3-turn) FPR while reducing latency by 2.5x through parallelization.
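To make the transactional monitor idea concrete, the following is a minimal sketch, not the authors' implementation. It assumes a shared object that parallel day-level planners must commit through, so that the budget check and the uniqueness check happen atomically. The names `ConstraintMonitor`, `try_commit`, `plan_day`, and `run_demo`, as well as the venues and costs, are hypothetical and chosen only for illustration.

```python
import threading
from dataclasses import dataclass, field


@dataclass
class ConstraintMonitor:
    """Hypothetical shared monitor: each commit is atomic and all-or-nothing."""
    total_budget: float
    spent: float = 0.0
    used_venues: set = field(default_factory=set)
    _lock: threading.Lock = field(default_factory=threading.Lock, repr=False)

    def try_commit(self, venue: str, cost: float) -> bool:
        # Check and update under one lock so parallel planners cannot
        # overspend the budget or pick the same venue twice.
        with self._lock:
            if venue in self.used_venues:
                return False  # uniqueness constraint violated
            if self.spent + cost > self.total_budget:
                return False  # global budget would be exceeded
            self.used_venues.add(venue)
            self.spent += cost
            return True


def plan_day(monitor, day, candidates, results):
    """Stand-in for a day-level planner: take the first candidate that commits."""
    for venue, cost in candidates:
        if monitor.try_commit(venue, cost):
            results[day] = venue
            return
    # No feasible option: in the framework described above, this is the
    # point where a planner would reject its sub-goal and ask for re-planning.
    results[day] = None


def run_demo():
    monitor = ConstraintMonitor(total_budget=100.0)
    candidates = [("Cafe A", 40.0), ("Bistro B", 40.0), ("Diner C", 40.0)]
    results = {}
    threads = [
        threading.Thread(target=plan_day, args=(monitor, day, candidates, results))
        for day in range(3)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return monitor, results


if __name__ == "__main__":
    monitor, results = run_demo()
    print(results, monitor.spent)
```

In this toy run, three planners compete for a budget that covers only two of the three options, so exactly one planner ends up without a feasible choice. Which day that is depends on thread scheduling, but the committed state never violates either constraint.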