Code Evolution Graphs: Understanding Large Language Model Driven Design of Algorithms
This work addresses the lack of interpretability in LLM-driven algorithm design, a problem for researchers and practitioners in automated code generation and evolutionary computation. The contribution is incremental in nature.
The paper tackles the problem of understanding and analyzing the code generation process of Large Language Models (LLMs) in evolutionary frameworks, where optimization can stall or produce non-competitive algorithms. It presents a novel approach that lets users analyze how generated code evolves over repeated prompting. The results show that repeated prompting increases code complexity, that the added complexity can hurt performance in some cases, and that different LLMs have distinct coding styles, which suggests that using multiple LLMs might improve results.
Large Language Models (LLMs) have demonstrated great promise in generating code, especially when used inside an evolutionary computation framework to iteratively optimize the generated algorithms. However, in some cases they fail to generate competitive algorithms or the code optimization stalls, and we are left with no recourse because we lack an understanding of the generation process and of the generated code. We present a novel approach to mitigate this problem by enabling users to analyze the code generated inside the evolutionary process and how it evolves over repeated prompting of the LLM. We show results for three benchmark problem classes and demonstrate novel insights. In particular, LLMs tend to generate more complex code with repeated prompting, but the additional complexity can hurt algorithmic performance in some cases. Different LLMs have different coding "styles", and the code generated by one LLM tends to be dissimilar to that of other LLMs. These two findings suggest that using different LLMs inside code evolution frameworks might produce higher-performing code than using only one LLM.
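To make the complexity finding concrete, the following is a minimal sketch of how structural complexity could be tracked across successively generated candidates. It is an illustration under stated assumptions, not the method of the paper: the feature set (node count, tree depth, branch count), the function names `ast_depth` and `complexity_features`, and the two example candidates are all hypothetical, and the generated algorithms are assumed to be Python source.

```python
import ast

# Node types counted as decision points (an illustrative choice, not from the paper).
BRANCH_NODES = (ast.If, ast.For, ast.While, ast.Try, ast.BoolOp, ast.IfExp)


def ast_depth(node: ast.AST) -> int:
    """Depth of the syntax tree rooted at `node` (a single node has depth 1)."""
    children = list(ast.iter_child_nodes(node))
    if not children:
        return 1
    return 1 + max(ast_depth(child) for child in children)


def complexity_features(source: str) -> dict:
    """Return simple structural features of a Python source string."""
    tree = ast.parse(source)
    nodes = list(ast.walk(tree))
    return {
        "nodes": len(nodes),
        "depth": ast_depth(tree),
        "branches": sum(isinstance(n, BRANCH_NODES) for n in nodes),
    }


# Two hypothetical candidates, as if produced by successive prompts.
generation_0 = "def step(x):\n    return x + 1\n"
generation_1 = (
    "def step(x):\n"
    "    if x < 0:\n"
    "        return -x\n"
    "    for _ in range(3):\n"
    "        x = x + 1\n"
    "    return x\n"
)

# One feature dictionary per candidate, in generation order.
history = [complexity_features(src) for src in (generation_0, generation_1)]
```

Plotting such features against the fitness of each candidate is one way to check whether growth in complexity coincides with a gain or a loss in performance. Comparing the feature vectors of candidates from different LLMs would be a similarly rough proxy for the stylistic differences described above.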