CL, AI, SE
Sep 29, 2023

Enhancing Large Language Models in Coding Through Multi-Perspective Self-Consistency

arXiv:2309.17272v3
62 citations
h-index: 66
Originality: Incremental advance
AI Analysis

This paper addresses unreliable code generation in large language models, a problem for developers and AI practitioners, and offers an incremental improvement over existing verification methods.

The paper tackles the challenge of generating correct code in a single attempt by proposing the Multi-Perspective Self-Consistency framework, which improves performance on benchmarks like HumanEval (+15.91%), MBPP (+6.43%), and CodeContests (+9.37%) for models such as ChatGPT, even surpassing GPT-4.

Large language models (LLMs) have exhibited remarkable ability in code generation. However, generating the correct solution in a single attempt still remains a challenge. Prior works utilize verification properties in software engineering to verify and re-rank solutions in a majority voting manner. But the assumption behind them that generated verification properties have better qualities than solutions may not always hold. In this paper, we treat them equally as different perspectives of LLMs' reasoning processes. We propose the Multi-Perspective Self-Consistency (MPSC) framework incorporating both inter- and intra-consistency across outputs from multiple perspectives. Specifically, we prompt LLMs to generate diverse outputs from three perspectives, Solution, Specification and Test case, constructing a 3-partite graph. With two measure functions of consistency, we embed both inter- and intra-consistency information into the graph. The optimal choice of solutions is then determined based on analysis in the graph. MPSC significantly boosts performance of foundation models (ChatGPT in this paper) on various benchmarks, including HumanEval (+15.91%), MBPP (+6.43%) and CodeContests (+9.37%), even surpassing GPT-4.
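The abstract does not spell out the two consistency measures or the graph analysis, so the following is only a minimal sketch of the general idea under stated assumptions, not the authors' implementation. It assumes solutions are Python callables, specifications are predicates over (input, output), and test cases are (input, expected) pairs. Inter-consistency is approximated by execution agreement, intra-consistency by a uniform prior, and the graph analysis by a simple score propagation. The names `rank_solutions`, `build_edges`, `alpha` and `iterations` are hypothetical.

```python
_FAILED = object()  # sentinel for a candidate that raised an exception


def run(fn, *args):
    """Call fn and return its result, or _FAILED if it raises."""
    try:
        return fn(*args)
    except Exception:
        return _FAILED


def holds(spec, x, y):
    """True only if the specification accepts output y for input x."""
    if y is _FAILED:
        return False
    result = run(spec, x, y)
    return result is not _FAILED and bool(result)


def build_edges(solutions, specs, tests):
    """Weight every cross-perspective pair in [0, 1] (assumed measure)."""
    edges = {}
    inputs = [x for x, _ in tests]
    for i, sol in enumerate(solutions):
        for k, (x, expected) in enumerate(tests):
            edges[("sol", i), ("test", k)] = float(run(sol, x) == expected)
        for j, spec in enumerate(specs):
            passed = [holds(spec, x, run(sol, x)) for x in inputs]
            edges[("sol", i), ("spec", j)] = sum(passed) / len(passed)
    for j, spec in enumerate(specs):
        for k, (x, expected) in enumerate(tests):
            edges[("spec", j), ("test", k)] = float(holds(spec, x, expected))
    return edges


def rank_solutions(solutions, specs, tests, alpha=0.5, iterations=20):
    """Return solution indices, best first, by propagated consistency score."""
    edges = build_edges(solutions, specs, tests)
    nodes = ([("sol", i) for i in range(len(solutions))]
             + [("spec", j) for j in range(len(specs))]
             + [("test", k) for k in range(len(tests))])
    neighbors = {n: [] for n in nodes}
    for (a, b), w in edges.items():  # edges only link different perspectives
        neighbors[a].append((b, w))
        neighbors[b].append((a, w))
    # Assumed intra-consistency prior: uniform over all nodes.
    prior = {n: 1.0 / len(nodes) for n in nodes}
    score = dict(prior)
    for _ in range(iterations):
        new = {n: alpha * sum(w * score[m] for m, w in neighbors[n])
                  + (1 - alpha) * prior[n] for n in nodes}
        total = sum(new.values())
        score = {n: v / total for n, v in new.items()}
    return sorted(range(len(solutions)),
                  key=lambda i: score[("sol", i)], reverse=True)


# Toy task: absolute value. Only the second candidate is correct.
solutions = [lambda x: x, lambda x: abs(x), lambda x: -x]
specs = [lambda x, y: y >= 0, lambda x, y: y == x or y == -x]
tests = [(-2, 2), (3, 3), (0, 0)]
best = rank_solutions(solutions, specs, tests)[0]
```

In this toy run the correct candidate agrees with every specification and test case, so it accumulates the highest score. No perspective is treated as ground truth: a faulty test case or specification would simply end up weakly connected and contribute less to the ranking.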

Code Implementations: 1 repo
