CL, AI, LG · Feb 4, 2024

Synergy-of-Thoughts: Eliciting Efficient Reasoning in Hybrid Language Models

arXiv:2402.02563v4 · 10 citations · h-index: 34
Originality: Incremental advance
AI Analysis

This work addresses the cost barrier to deploying LLMs in real-world applications, particularly for open-ended tasks, though it is an incremental improvement over existing reasoning methods.

The paper tackles the high API cost of large language models (LLMs) in reasoning tasks by proposing Synergy of Thoughts (SoT), a hybrid framework that uses smaller models for intuitive thoughts and calls larger models for reflective reasoning only when those thoughts conflict. It reports API cost reductions of 38.3%-75.1% while achieving state-of-the-art accuracy and solution diversity.

Large language models (LLMs) have shown impressive emergent abilities in a wide range of tasks, but the associated expensive API cost greatly limits their real-world application. Previous works like chain-of-thought (CoT) and tree-of-thoughts (ToT) have predominantly focused on enhancing accuracy, but overlook the rapidly increasing API cost, which could be particularly problematic for open-ended real-world tasks with huge solution spaces. Motivated by the dual process theory of human cognition, we propose "Synergy of Thoughts" (SoT) to unleash the synergistic potential of hybrid LLMs with different scales for efficient reasoning. By default, SoT uses smaller-scale language models to generate multiple low-cost intuitive thoughts, which resemble the parallel intuitions produced by System 1. We then design a confidence evaluator where the intuitive thoughts are cross-evaluated and introduce a controllable threshold mechanism to decide their mutual conflict. If these intuitive thoughts exhibit conflicts, SoT will invoke the reflective reasoning of scaled-up language models to emulate the intervention of System 2, which will override the intuitive thoughts and rectify the reasoning results. This framework is model-agnostic and training-free, and can be flexibly implemented with various off-the-shelf LLMs. Experiments on six representative reasoning tasks show that SoT substantially reduces the API cost by 38.3%-75.1%, and simultaneously achieves state-of-the-art reasoning accuracy and solution diversity. Notably, the average token cost reduction on open-ended tasks reaches up to 69.1%.
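
The control flow described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function name `synergy_step`, the callable types, the mean-of-peer-scores confidence rule, and the default threshold of 0.7 are all assumptions made for illustration. The abstract only states that intuitive thoughts are cross-evaluated and that a controllable threshold decides whether they conflict.

```python
from statistics import mean
from typing import Callable, Sequence, Tuple

# Hypothetical interfaces (assumptions, not from the paper):
# a generator maps a prompt to one thought; a scorer maps
# (prompt, thought) to a confidence in [0, 1].
Generator = Callable[[str], str]
Scorer = Callable[[str, str], float]


def synergy_step(
    prompt: str,
    small_generators: Sequence[Generator],
    small_scorers: Sequence[Scorer],
    large_generator: Generator,
    threshold: float = 0.7,  # assumed default; the paper calls it controllable
) -> Tuple[str, bool]:
    """Run one reasoning step; return (thought, used_large_model)."""
    if len(small_generators) < 2 or len(small_generators) != len(small_scorers):
        raise ValueError("need at least two small models, one scorer per generator")

    # System 1: each small model proposes a low-cost intuitive thought.
    thoughts = [generate(prompt) for generate in small_generators]

    # Cross-evaluation: every thought is scored by the *other* small models.
    # Averaging peer scores is an assumed aggregation rule.
    confidences = []
    for i, thought in enumerate(thoughts):
        peer_scores = [
            score(prompt, thought)
            for j, score in enumerate(small_scorers)
            if j != i
        ]
        confidences.append(mean(peer_scores))

    # No conflict: keep the most trusted intuitive thought.
    best = max(range(len(thoughts)), key=confidences.__getitem__)
    if confidences[best] >= threshold:
        return thoughts[best], False

    # Conflict: System 2 intervenes and overrides the intuitive thoughts.
    return large_generator(prompt), True
```

In this sketch the expensive model is called only when no intuitive thought clears the threshold, which is the mechanism the abstract credits for the cost savings. Raising `threshold` trades more large-model calls for more reflective oversight.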

