Reinforced Graph of Thoughts: RL-Driven Adaptive Prompting for LLMs
For LLM users and researchers, RGoT automates and adapts the GoT prompting paradigm, reducing the need for manual solution-specific knowledge.
Reinforced Graph of Thoughts (RGoT) uses reinforcement learning to automatically generate adaptive graphs of operations for LLM prompting, overcoming the rigidity of the manually defined graphs in Graph of Thoughts (GoT). Results indicate that, under certain constraints, it can construct operation graphs that adapt to task complexity.
Graph of Thoughts (GoT), a generalization of recent prompting paradigms for large language models (LLMs), has been shown to be useful for elaborate problem solving. By executing a graph of operations, GoT structures the thoughts of the LLM as an arbitrary graph, which forms the actual graph of thoughts. In the original approach, the graph of operations is defined manually, which requires in-depth knowledge of how to solve the problem at hand. Such a static graph of operations is rigid and therefore lacks adaptability. We propose Reinforced Graph of Thoughts (RGoT), an automated approach to the GoT prompting paradigm that leverages reinforcement learning (RL) to adaptively generate a graph of operations from a human-defined set of operations. Results indicate that, under certain constraints, graphs of operations can be constructed automatically and adapted to the task's complexity.
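The idea of building a graph of operations step by step from a human-defined set can be illustrated with a minimal sketch. This is not the RGoT implementation: the operation names (`generate`, `refine`, `aggregate`), the `OperationGraph` class, and `build_graph` are hypothetical, the operations are string stubs rather than LLM calls, the graph is restricted to a simple chain, and a seeded random policy stands in for a trained RL agent.

```python
import random
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# Hypothetical stand-ins for LLM-backed operations (assumption: a real
# system would prompt an LLM here instead of formatting strings).
def generate(thoughts: List[str]) -> List[str]:
    return thoughts + [f"draft({thoughts[-1]})"]

def refine(thoughts: List[str]) -> List[str]:
    return thoughts + [f"refined({thoughts[-1]})"]

def aggregate(thoughts: List[str]) -> List[str]:
    return [f"merged({', '.join(thoughts)})"]

# The human-defined set of operations the policy may choose from.
OPERATIONS: Dict[str, Callable[[List[str]], List[str]]] = {
    "generate": generate,
    "refine": refine,
    "aggregate": aggregate,
}

@dataclass
class OperationGraph:
    """Chain-shaped graph of operations (a simplification of a general graph)."""
    nodes: List[str] = field(default_factory=list)
    edges: List[Tuple[int, int]] = field(default_factory=list)

    def append(self, op_name: str) -> None:
        self.nodes.append(op_name)
        if len(self.nodes) > 1:
            # Connect the previous operation to the new one.
            self.edges.append((len(self.nodes) - 2, len(self.nodes) - 1))

Policy = Callable[[List[str], List[str]], str]

def random_policy(seed: int) -> Policy:
    """Placeholder policy; an RL agent would pick actions from learned values."""
    rng = random.Random(seed)
    def policy(thoughts: List[str], op_names: List[str]) -> str:
        return rng.choice(op_names)
    return policy

def build_graph(task: str, policy: Policy, max_steps: int = 5):
    """Let the policy pick one operation per step and execute it immediately."""
    thoughts = [task]
    graph = OperationGraph()
    for _ in range(max_steps):
        op_name = policy(thoughts, list(OPERATIONS))
        graph.append(op_name)
        thoughts = OPERATIONS[op_name](thoughts)
    return graph, thoughts

if __name__ == "__main__":
    graph, thoughts = build_graph("sort [3, 1, 2]", random_policy(0), max_steps=4)
    print(graph.nodes)
    print(thoughts)
```

In an RL setting, the current thoughts would serve as the state, the choice of operation as the action, and a task-specific score of the final thoughts as the reward. Those design choices are assumptions of this sketch, not details taken from the text above.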