CL · Apr 8, 2024

RoT: Enhancing Large Language Models with Reflection on Search Trees

arXiv:2404.05449v3 · 14 citations · h-index: 27
Originality: Incremental advance
AI Analysis

The paper addresses inefficiencies in LLM reasoning that matter to AI researchers, though the contribution is incremental because it builds on existing tree-search methods.

The paper tackles the problem of large language models repeating mistakes in tree-search-based prompting by introducing RoT, a reflection framework that uses a strong LLM to generate guidelines from past search experiences, improving performance on reasoning and planning tasks with concrete gains over baseline methods.

Large language models (LLMs) have demonstrated impressive capability in reasoning and planning when integrated with tree-search-based prompting methods. However, since these methods ignore previous search experiences, they often repeat the same mistakes in the search process. To address this issue, we introduce Reflection on search Trees (RoT), an LLM reflection framework designed to improve the performance of tree-search-based prompting methods. It uses a strong LLM to summarize guidelines from previous tree search experiences to enhance the ability of a weak LLM. The guidelines are instructions for solving the task through tree search, and they help prevent the weak LLM from repeating mistakes made in past search processes. In addition, we propose a novel state selection method, which identifies the critical information in historical search processes to help RoT generate more specific and meaningful guidelines. In our extensive experiments, we find that RoT significantly improves the performance of LLMs in reasoning or planning tasks with various tree-search-based prompting methods (e.g., BFS and MCTS). Non-tree-search-based prompting methods such as Chain-of-Thought (CoT) can also benefit from RoT guidelines, since RoT can provide task-specific knowledge collected from the search experience.
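
The loop described in the abstract (select critical states from past search trees, have a strong LLM summarize them into guidelines, then pass those guidelines to a weak LLM) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the `Node` structure, the `importance` criterion (largest value change between a state and its children), the `threshold` parameter, the prompt wording, and every function name are assumptions made for the example. The abstract does not specify how state selection is scored.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Node:
    """One state in a finished search tree (hypothetical structure)."""
    state: str
    value: float                      # value estimate assigned during search
    action: str = ""                  # action that led to this state
    children: List["Node"] = field(default_factory=list)


def importance(node: Node) -> float:
    """Assumed criterion: largest value change between a state and any child."""
    if not node.children:
        return 0.0
    return max(abs(child.value - node.value) for child in node.children)


def select_critical_states(root: Node, threshold: float) -> List[Node]:
    """Walk the tree and keep states whose importance exceeds the threshold."""
    selected, stack = [], [root]
    while stack:
        node = stack.pop()
        if importance(node) > threshold:
            selected.append(node)
        stack.extend(node.children)
    return selected


def build_reflection_prompt(nodes: List[Node]) -> str:
    """Describe each critical state and its outcomes (illustrative wording)."""
    parts = []
    for node in nodes:
        outcomes = "\n".join(
            f"  action: {c.action} -> value {c.value:.2f}" for c in node.children
        )
        parts.append(f"State: {node.state} (value {node.value:.2f})\n{outcomes}")
    header = "Summarize guidelines that would avoid the mistakes shown below.\n\n"
    return header + "\n\n".join(parts)


def reflect(trees: List[Node], strong_llm: Callable[[str], str],
            threshold: float = 0.3) -> str:
    """Ask the strong LLM for guidelines based on critical states only."""
    critical = [n for tree in trees for n in select_critical_states(tree, threshold)]
    if not critical:
        return ""
    return strong_llm(build_reflection_prompt(critical))


def with_guidelines(task_prompt: str, guidelines: str) -> str:
    """Prepend guidelines to the prompt later given to the weak LLM."""
    if not guidelines:
        return task_prompt
    return f"Guidelines from past searches:\n{guidelines}\n\n{task_prompt}"
```

Here `strong_llm` is any callable that maps a prompt string to a completion string, so the sketch stays independent of a particular model API. Filtering to high-importance states before reflection keeps the reflection prompt short and focused on the decisions that changed the outcome most, which is the stated purpose of the state selection step.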

Code Implementations: 1 repo
