Let's Sample Step by Step: Adaptive-Consistency for Efficient Reasoning and Coding with LLMs
This work addresses cost and efficiency concerns for users of large language models in reasoning and coding tasks, though it is an incremental improvement over existing Self-Consistency methods.
The paper tackles the inefficiency of fixed-size sampling in Self-Consistency for LLMs by introducing Adaptive-Consistency, which dynamically adjusts the number of samples per question, reducing the sample budget by up to 7.9 times with an average accuracy drop of less than 0.1%.
A popular approach for improving the correctness of output from large language models (LLMs) is Self-Consistency: poll the LLM multiple times and output the most frequent solution. Existing Self-Consistency techniques always generate a constant number of samples per question, whereas a better approach would be to distribute the available budget non-uniformly, based on the amount of agreement in the samples generated so far. In response, we introduce Adaptive-Consistency, a cost-efficient, model-agnostic technique that dynamically adjusts the number of samples per question using a lightweight stopping criterion. Our experiments over 17 reasoning and code generation datasets and three LLMs demonstrate that Adaptive-Consistency reduces the sample budget by up to 7.9 times with an average accuracy drop of less than 0.1%. Our code and data are available at https://www.sample-step-by-step.info