Specification and Detection of LLM Code Smells
This work addresses software quality issues faced by developers who integrate LLMs into software systems, but its contribution is incremental because it extends an existing detection tool.
The paper tackles the lack of a formal catalog of problematic coding practices in LLM inference by introducing and formalizing five LLM code smells. It finds that these smells affect 60.50% of the analyzed systems, with a detection precision of 86.06%.
Large Language Models (LLMs) have gained massive popularity in recent years and are increasingly integrated into software systems for diverse purposes. However, integrating them poorly into source code may undermine software system quality, and, to our knowledge, there is no formal catalog of code smells specific to coding practices for LLM inference. In this paper, we introduce the concept of LLM code smells and formalize five recurrent problematic coding practices related to LLM inference in software systems, based on relevant literature. We extend the detection tool SpecDetect4AI to cover the newly defined LLM code smells and use it to validate their prevalence in a dataset of 200 open-source LLM systems. Our results show that LLM code smells affect 60.50% of the analyzed systems, with a detection precision of 86.06%.