Testing the Effect of Code Documentation on Large Language Model Code Understanding
The work addresses a practical issue for developers who use LLMs in code-related tasks, but its contribution is incremental, since it builds on existing research on LLM capabilities.
The paper tackles the problem of how code documentation affects LLM code understanding, showing that incorrect documentation greatly hinders it, while incomplete or missing documentation does not significantly affect it.
Large Language Models (LLMs) have demonstrated impressive abilities in code generation and understanding in recent years. However, little work has investigated how documentation and other code properties affect an LLM's ability to understand and generate code or documentation. We present an empirical analysis of how underlying properties of code or documentation can affect an LLM's capabilities. We show that providing an LLM with "incorrect" documentation can greatly hinder code understanding, while incomplete or missing documentation does not seem to significantly affect an LLM's ability to understand code.