PLJan 22

RECURSUM: Automated Code Generation for Recurrence Relations Exceeds Expert Optimization via LayeredCodegen

arXiv:2604.18585h-index: 1
AI Analysis

For scientists and engineers using recurrence relations in fields like quantum chemistry and numerical analysis, this work eliminates the need for dual expertise in domain knowledge and C++ metaprogramming, enabling automated generation that exceeds manual optimization.

RECURSUM is a Python DSL that generates optimized C++ for recurrence relations, achieving 9.8x speedup over expert hand-written code and 1.9x over template metaprogramming for McMurchie-Davidson Hermite coefficients, with speedups propagating to complete algorithms within 3.3% of expert baselines.

Automated code generation can systematically exceed expert hand-optimization for recurrence relations-computational primitives ubiquitous in orthogonal polynomials, special functions, numerical integration, and molecular integral evaluation. We present RECURSUM, a Python-based domain-specific language generating optimized C++ for arbitrary recurrence relations via three backends: template metaprogramming for compile-time evaluation, a novel LayeredCodegen backend with architectural optimizations, and runtime loop-based evaluation. The DSL uses einsum-inspired notation to specify recurrences, validity constraints, and base cases in 10-30 lines of Python, generating 650+ lines of production C++. LayeredCodegen achieves 9.8x speedup over expert hand-written implementations and 1.9x over template metaprogramming for McMurchie-Davidson Hermite coefficients. Architecture analysis reveals three quantifiable effects: (1) zero-copy output parameters eliminate return-by-value overhead (70-80% of speedup), (2) guaranteed function inlining eliminates compiler-refused overhead (15-20%), (3) exact-sized stack buffers achieve 100% cache efficiency vs 27% for MAX-sized arrays (5-10%). We validate on 24 recurrence types spanning pure mathematics (Legendre, Chebyshev, Hermite, Laguerre polynomials), numerical analysis (Clenshaw, Golub-Welsch), and quantum chemistry (McMurchie-Davidson, Rys quadrature, Boys function). Production benchmarks show speedups propagate to complete algorithms, with generated code matching expert baselines within 3.3%. RECURSUM demonstrates that systematic code generation serves as the performance ceiling for recurrence algorithms. By eliminating the dual expertise barrier (domain knowledge + C++ metaprogramming), the framework democratizes high-performance scientific computing-establishing a paradigm where automated generation systematically exceeds manual optimization.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes