Code Hallucination
This work addresses the problem of unreliable code generation for developers who use AI copilots, though its contribution is incremental: it focuses on demonstrating and triggering existing hallucination types.
The paper tackles code hallucination in large language models used for code generation. It presents HallTrigger, a technique that efficiently triggers hallucinations in blackbox models without access to their internals. The reported results suggest that the technique is effective and that hallucination has a significant impact on software development.
Generative models such as large language models (LLMs) are extensively used as code copilots and for whole-program generation. However, the programs they generate often have questionable correctness, authenticity, and reliability in terms of integration: they might not follow the user requirements, might produce incorrect or nonsensical outputs, or might even contain semantic or syntactic errors. These failures are collectively known as LLM hallucination. In this work, we present several types of code hallucination. We have generated such hallucinated code manually using large language models. We also present a technique, HallTrigger, to demonstrate efficient ways of generating arbitrary code hallucination. Our method leverages three different dynamic attributes of LLMs to craft prompts that can successfully trigger hallucinations from models without the need to access model architecture or parameters. Results from popular blackbox models suggest that HallTrigger is indeed effective and that pervasive LLM hallucination has a considerable impact on software development.