Rethinking Code Complexity Through the Lens of Large Language Models
For researchers and practitioners using LLMs on code, this work identifies a fundamental mismatch with existing metrics and provides a new metric that better predicts LLM difficulty.
Traditional code complexity metrics do not correlate with LLM performance; the proposed LM-CC metric, based on semantic nonlinearity, shows strong partial correlations with LLM performance and improves downstream tasks when reduced.
Code complexity metrics such as cyclomatic complexity have long been used to assess software quality and maintainability. With the rapid advancement of large language models (LLMs) on coding tasks, an important yet underexplored question arises: do traditional complexity metrics meaningfully characterize the coding difficulty that LLMs perceive? In this work, we empirically demonstrate that classical complexity metrics exhibit no consistent correlation with LLM performance, revealing a fundamental mismatch with model-perceived difficulty. To address this gap, we propose LM-CC, a novel code complexity metric tailored for LLMs, grounded in the hypothesis that model-perceived code difficulty is fundamentally driven by semantic nonlinearity. LM-CC quantifies complexity through an entropy-guided semantic compositional hierarchy, capturing the cumulative uncertainty encountered by LLMs during code understanding. Our experimental results demonstrate that LM-CC exhibits strong and consistent partial correlations with LLM performance, while semantics-preserving reductions in LM-CC consistently lead to improved downstream task performance. The source code is available at: https://github.com/xchen121/lm-cc.