Instruction Fusion: Advancing Prompt Evolution through Hybridization
The paper addresses the problem of improving code generation in LLMs, which matters to developers and researchers, but the contribution appears incremental because it builds on existing prompt evolution techniques.
The paper tackles performance limitations in prompt evolution for code generation LLMs by introducing Instruction Fusion, a method that hybridizes two prompts, and reports significant improvements across five benchmarks including HumanEval and MBPP.
The fine-tuning of Large Language Models (LLMs) specialized in code generation has seen notable advancements through the use of open-domain coding queries. Despite these successes, existing methodologies such as Evol-Instruct encounter performance limitations that impede further gains on code generation tasks. This paper examines the constraints of existing prompt evolution techniques and introduces a novel approach, Instruction Fusion (IF). IF combines two distinct prompts through a hybridization process, thereby enhancing the evolution of training prompts for code LLMs. Our experimental results show that the proposed method effectively addresses the shortcomings of prior methods, significantly improving the performance of code LLMs across five code generation benchmarks: HumanEval, HumanEval+, MBPP, MBPP+, and MultiPL-E. These results underscore the effectiveness of Instruction Fusion in advancing the capabilities of LLMs in code generation.
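To make the hybridization idea concrete, the following is a minimal sketch of how two seed instructions might be paired and merged into a single fusion request. It is an illustration under stated assumptions, not the paper's implementation: the abstract does not give the fusion prompt or the pairing strategy, so the template text, the uniform random pairing, and the names `FUSION_TEMPLATE`, `build_fusion_prompt`, `sample_pairs`, `fuse_pool`, and the `generate` callback are all hypothetical.

```python
import random
from typing import Callable, List, Tuple

# Hypothetical template: the source does not give the actual fusion prompt.
FUSION_TEMPLATE = (
    "Below are two coding instructions.\n"
    "Instruction A: {a}\n"
    "Instruction B: {b}\n"
    "Write one new, self-contained coding instruction that combines "
    "the requirements of both."
)


def build_fusion_prompt(a: str, b: str) -> str:
    """Fill the (assumed) template with two seed instructions."""
    return FUSION_TEMPLATE.format(a=a.strip(), b=b.strip())


def sample_pairs(seeds: List[str], n_pairs: int, seed: int = 0) -> List[Tuple[str, str]]:
    """Draw n_pairs pairs of seed instructions from distinct pool positions.

    Uniform random pairing is an assumption made for this sketch.
    """
    if len(seeds) < 2:
        raise ValueError("need at least two seed instructions to form a pair")
    rng = random.Random(seed)
    pairs = []
    for _ in range(n_pairs):
        a, b = rng.sample(seeds, 2)  # two different positions in the pool
        pairs.append((a, b))
    return pairs


def fuse_pool(
    seeds: List[str],
    n_pairs: int,
    generate: Callable[[str], str],
    seed: int = 0,
) -> List[str]:
    """Produce fused instructions by sending each fusion prompt to `generate`.

    `generate` stands in for whatever LLM call is used; it is a placeholder.
    """
    return [
        generate(build_fusion_prompt(a, b))
        for a, b in sample_pairs(seeds, n_pairs, seed)
    ]
```

In use, `generate` would wrap a call to an instruction-following model, and the returned instructions would be added to the training prompt pool. Passing the model call as a callback keeps the sketch independent of any particular API.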