Do Code LLMs Understand Design Patterns?
This paper addresses a practical problem for software developers, who must post-process LLM-generated code to meet project design standards.
The researchers investigated whether Code LLMs understand design patterns, finding that biases in these models significantly affect the reliability of downstream tasks like code generation and bug detection.
Code Large Language Models (LLMs) demonstrate great versatility in adapting to various downstream tasks, including code generation and completion, as well as bug detection and fixing. However, Code LLMs often fail to capture existing coding standards, generating code that conflicts with the design patterns a given project requires. As a result, developers must post-process the generated code to adapt it to the project's design norms. In this work, we empirically investigate the biases of Code LLMs in software development. Through carefully designed experiments, we assess the models' understanding of design patterns across recognition, comprehension, and generation. Our findings reveal that biases in Code LLMs significantly affect the reliability of downstream tasks.