Mixture of Reasonings: Teach Large Language Models to Reason with Adaptive Strategies
MoR targets the adaptability and efficiency limits of prompt-based LLM reasoning, which matters to users who need robust performance across diverse tasks without manual prompt engineering. The contribution is incremental, since it builds on existing prompting techniques.
The paper addresses large language models' reliance on manually crafted, task-specific prompts for reasoning. It introduces Mixture of Reasoning (MoR), a training framework that embeds diverse reasoning strategies into LLMs so that they reason autonomously and adapt to the task at hand. Reported gains include MoR150 reaching 0.730 (a 2.2% improvement) with CoT prompting and 0.734 (a 13.5% improvement) compared to baselines.
Large language models (LLMs) excel in complex tasks through advanced prompting techniques like Chain-of-Thought (CoT) and Tree-of-Thought (ToT), but their reliance on manually crafted, task-specific prompts limits adaptability and efficiency. We introduce Mixture of Reasoning (MoR), a training framework that embeds diverse reasoning strategies into LLMs for autonomous, task-adaptive reasoning without external prompt engineering. MoR has two phases: Thought Generation, creating reasoning chain templates with models like GPT-4o, and SFT Dataset Construction, pairing templates with benchmark datasets for supervised fine-tuning. Our experiments show that MoR significantly enhances performance, with MoR150 achieving 0.730 (2.2% improvement) using CoT prompting and 0.734 (13.5% improvement) compared to baselines. MoR eliminates the need for task-specific prompts, offering a generalizable solution for robust reasoning across diverse tasks.
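The second phase, SFT Dataset Construction, can be illustrated with a minimal sketch. It assumes that "pairing templates with benchmark datasets" means attaching one or more sampled reasoning templates to each benchmark question; the abstract does not say how the pairing is done. The names `SFTExample`, `build_sft_dataset`, `TEMPLATES`, and `BENCHMARK` are hypothetical, the template strings and questions are placeholders rather than data from the paper, and the Thought Generation phase (calling a model such as GPT-4o) is replaced here by a hard-coded list.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class SFTExample:
    """One supervised fine-tuning record (hypothetical schema)."""
    question: str
    template: str
    answer: str

    def prompt(self) -> str:
        # Text that would be shown to the model during fine-tuning.
        return f"{self.template}\n\nQuestion: {self.question}"


def build_sft_dataset(templates, benchmark, per_question=1, seed=0):
    """Pair each benchmark item with `per_question` distinct templates.

    Assumption: templates are sampled uniformly at random without
    replacement; the paper may use a different selection rule.
    """
    rng = random.Random(seed)  # fixed seed keeps the pairing reproducible
    k = min(per_question, len(templates))
    examples = []
    for item in benchmark:
        for template in rng.sample(templates, k):
            examples.append(
                SFTExample(item["question"], template, item["answer"])
            )
    return examples


# Placeholder stand-ins for Thought Generation output and a benchmark.
TEMPLATES = [
    "Break the problem into steps and solve each in order.",
    "List the candidate answers, then eliminate the wrong ones.",
    "Restate the problem, find an analogy, then answer.",
]
BENCHMARK = [
    {"question": "What is 17 + 25?", "answer": "42"},
    {"question": "Is 91 a prime number?", "answer": "No"},
]

dataset = build_sft_dataset(TEMPLATES, BENCHMARK, per_question=2, seed=0)
for example in dataset:
    print(example.prompt(), "->", example.answer)
```

In a real pipeline the target text for fine-tuning would be a full reasoning chain that follows the template, presumably generated and checked against the reference answer; this sketch stores only the reference answer to stay self-contained.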