Self-Consistency from Only Two Samples: CoT-PoT Ensembling for Efficient LLM Reasoning
This work addresses the high computational cost of self-consistency in LLM reasoning, offering a practical efficiency gain for practitioners.
The authors propose a hybrid ensembling method that combines Chain-of-Thought and Program-of-Thought reasoning for self-consistency, reducing the required number of samples by a factor of 9.3 and allowing 78.6% of tasks to be addressed with only two samples.
Self-consistency (SC) is a popular technique for improving the reasoning accuracy of large language models by aggregating multiple sampled outputs, but it comes at a high computational cost due to extensive sampling. We introduce a hybrid ensembling approach that leverages the complementary strengths of two distinct modes of reasoning: Chain-of-Thought (CoT) and Program-of-Thought (PoT). We describe a general framework for combining these two forms of reasoning in self-consistency, as well as particular strategies for both full sampling and early stopping. We show that CoT-PoT ensembling not only improves overall accuracy, but also drastically reduces the number of samples required for SC, by a factor of 9.3. In particular, the majority of tasks (78.6%) can be addressed with only two samples, which has not been possible with any prior SC method.
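The abstract does not spell out the early-stopping rule, so the following is only a minimal sketch of one plausible reading: draw one CoT sample and one PoT sample, stop if their final answers agree, and otherwise fall back to ordinary majority voting over further samples. The function name `cot_pot_early_stop`, the `sample_cot` and `sample_pot` callables, and the `max_samples` budget are all hypothetical names introduced for illustration, not the authors' implementation.

```python
from collections import Counter
from itertools import cycle
from typing import Callable, Tuple


def cot_pot_early_stop(
    sample_cot: Callable[[], str],
    sample_pot: Callable[[], str],
    max_samples: int = 10,
) -> Tuple[str, int]:
    """Return (answer, number_of_samples_used).

    sample_cot / sample_pot are assumed to return one final answer string
    each time they are called (e.g. by querying an LLM and, for PoT,
    executing the generated program). Both are placeholders.
    """
    # Stage 1: one sample from each reasoning mode.
    answers = [sample_cot(), sample_pot()]

    # Assumed stopping rule: agreement across the two modes ends sampling.
    if answers[0] == answers[1]:
        return answers[0], 2

    # Stage 2 (fallback): alternate between modes until the budget is
    # spent, then take a plain majority vote over all collected answers.
    samplers = cycle([sample_cot, sample_pot])
    while len(answers) < max_samples:
        answers.append(next(samplers)())

    winner, _ = Counter(answers).most_common(1)[0]
    return winner, len(answers)
```

Under this reading, a task is "addressed with only two samples" whenever the first CoT answer and the first PoT answer coincide; the intuition is that agreement between two different reasoning modes is stronger evidence than agreement between two samples of the same mode.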