CL AI | Aug 25, 2024

Path-Consistency with Prefix Enhancement for Efficient Inference in LLMs

Jiace Zhu, Yuanzhe Huang, Yingtao Shen, Jie Zhao, An Zou

arXiv:2409.01281v3 | 10.013 citations | h-index: 4

Originality: Incremental advance

AI Analysis

The paper addresses slow and expensive inference in LLM reasoning applications, offering an incremental improvement over existing self-consistency techniques.

The paper tackles the computational inefficiency of self-consistency methods in large language models by introducing path-consistency, which uses confidence in early answers to guide generation, reducing inference latency by up to 40.5% while maintaining accuracy across reasoning tasks.

To enhance the reasoning capabilities of large language models (LLMs), self-consistency has become a popular approach, combining multiple samplings with majority voting. However, current methods are computationally expensive and time-consuming because they require numerous samplings. To address this, this paper introduces path-consistency, which leverages the confidence of earlier-generated answers to identify the most promising prefix and uses that prefix to dynamically guide the generation of subsequent branches. This mitigates the errors and redundancies caused by random or less useful sampling in self-consistency and significantly accelerates inference by minimizing token consumption. Our extensive empirical results demonstrate that path-consistency reduces inference latency by up to 40.5%, while maintaining task accuracy across various tasks, including mathematical reasoning, commonsense reasoning, and symbolic reasoning.
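The idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify how confidence is measured or how the prefix is chosen, so this sketch assumes confidence is the majority answer's vote share and that the prefix grows by a fixed number of steps per round. The names `sample_path`, `toy_sampler`, `window`, `threshold`, and `prefix_step` are hypothetical.

```python
from collections import Counter


def majority_with_confidence(answers):
    """Return (most common answer, its share of the votes)."""
    answer, votes = Counter(answers).most_common(1)[0]
    return answer, votes / len(answers)


def path_consistency(sample_path, n_branches=12, window=4,
                     threshold=0.6, prefix_step=1):
    """Sample branches in rounds, reusing a promising prefix.

    sample_path(prefix) is a hypothetical stand-in for an LLM call: it
    continues from `prefix` (a list of reasoning steps) and returns
    (all_steps, answer). Returns (final_answer, newly_generated_steps).
    """
    paths = []            # every (steps, answer) pair sampled so far
    prefix = []           # reasoning steps shared by upcoming branches
    generated_steps = 0   # proxy for token consumption

    for start in range(0, n_branches, window):
        count = min(window, n_branches - start)
        for _ in range(count):
            steps, answer = sample_path(prefix)
            generated_steps += len(steps) - len(prefix)
            paths.append((steps, answer))

        # Assumption: confidence = vote share of the current majority answer.
        answer, conf = majority_with_confidence([a for _, a in paths])
        # Only paths from this round are guaranteed to extend the prefix.
        candidates = [s for s, a in paths[-count:] if a == answer]
        if conf >= threshold and candidates:
            prefix = candidates[0][: len(prefix) + prefix_step]

    final_answer, _ = majority_with_confidence([a for _, a in paths])
    return final_answer, generated_steps


def toy_sampler(prefix, total_steps=4):
    """Toy deterministic sampler: always 4 steps, always answers '42'."""
    steps = list(prefix) + ["step"] * (total_steps - len(prefix))
    return steps, "42"


result = path_consistency(toy_sampler)
```

With the toy sampler, plain self-consistency would generate 12 × 4 = 48 steps, while the sketch generates 16 + 12 + 8 = 36 because later rounds reuse a growing prefix. A real sampler would be stochastic, and the savings would depend on how quickly the vote share crosses the threshold.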

