HiLDe: Intentional Code Generation via Human-in-the-Loop Decoding
This paper addresses software security vulnerabilities caused by programmers' over-reliance on AI coding tools, though the contribution is incremental because it builds on existing human-in-the-loop concepts.
The paper tackles the problem of AI programming tools excluding users from decision-making, which leads to over-reliance and security risks. It proposes Human-in-the-loop Decoding, a technique that lets users observe and influence LLM decisions during code generation. In a within-subjects study with 18 participants on security-related tasks, HiLDe, an implementation of this technique, led participants to generate significantly fewer vulnerabilities and to better align code with their goals than a traditional code completion assistant.
While AI programming tools hold the promise of increasing programmers' capabilities and productivity to a remarkable degree, they often exclude users from essential decision-making processes, causing many to effectively "turn off their brains" and over-rely on solutions provided by these systems. These behaviors can have severe consequences in critical domains, like software security. We propose Human-in-the-loop Decoding, a novel interaction technique that allows users to observe and directly influence LLM decisions during code generation, in order to align the model's output with their personal requirements. We implement this technique in HiLDe, a code completion assistant that highlights critical decisions made by the LLM and provides local alternatives for the user to explore. In a within-subjects study (N=18) on security-related tasks, we found that HiLDe led participants to generate significantly fewer vulnerabilities and better align code generation with their goals compared to a traditional code completion assistant.
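To make the idea of highlighting critical decisions and offering local alternatives concrete, the following is a minimal sketch, not the authors' implementation. The abstract does not say how HiLDe identifies critical decisions, so this sketch assumes a simple heuristic: a generation step counts as a decision point when the entropy of the model's next-token distribution exceeds a threshold. The names `Decision`, `entropy_bits`, and `find_decisions`, the threshold value, and the toy probabilities are all hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class Decision:
    position: int       # index of the token in the generated sequence
    chosen: str         # token the model actually emitted
    alternatives: list  # other candidate tokens, most probable first
    entropy: float      # uncertainty of the distribution at this step, in bits


def entropy_bits(dist):
    """Shannon entropy of a token -> probability mapping, in bits."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)


def find_decisions(steps, threshold=0.8, k=3):
    """Flag steps whose next-token entropy is at least `threshold`.

    `steps` is a list of (chosen_token, {token: probability}) pairs.
    The entropy threshold is an assumed heuristic, not the paper's criterion.
    """
    decisions = []
    for i, (chosen, dist) in enumerate(steps):
        h = entropy_bits(dist)
        if h < threshold:
            continue  # the model was confident here; nothing to highlight
        ranked = sorted(dist, key=dist.get, reverse=True)
        alts = [tok for tok in ranked if tok != chosen][:k]
        decisions.append(Decision(i, chosen, alts, h))
    return decisions


# Toy distributions for a completion that hashes a password (made-up numbers).
steps = [
    ("hashlib", {"hashlib": 0.97, "bcrypt": 0.03}),
    (".", {".": 1.0}),
    ("md5", {"md5": 0.5, "sha256": 0.3, "sha1": 0.2}),
]
decisions = find_decisions(steps)
for d in decisions:
    print(d.position, d.chosen, d.alternatives)
```

With these toy numbers, only the third step is flagged, and the user would see `sha256` and `sha1` as alternatives to `md5`. In an interactive assistant, picking an alternative would presumably trigger regeneration from that point onward; that step is omitted here because it depends on the underlying model API.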