SE, AI | Apr 21, 2025

LLM-Assisted Translation of Legacy FORTRAN Codes to C++: A Cross-Platform Study

arXiv:2504.15424v1 | 14 citations | h-index: 7
Proceedings of the 1st Workshop on AI and Scientific Discovery: Directions and Opportunities
Originality: Synthesis-oriented
AI Analysis

This work addresses the challenge of modernizing legacy HPC codes for domain experts, but it is incremental: it focuses on evaluation rather than introducing new methods.

The study evaluates how well Large Language Models (LLMs) translate legacy Fortran codes to C++. It statistically quantifies three metrics: the compilation accuracy of the translated code, its similarity to human-written C++ translations, and the similarity of its output to that of the original Fortran program.

Large Language Models (LLMs) are increasingly being leveraged for generating and translating scientific computer codes by both domain experts and non-domain experts. Fortran has served as one of the go-to programming languages in legacy high-performance computing (HPC) for scientific discoveries. Despite growing adoption, LLM-based code translation of legacy codebases has not been thoroughly assessed or quantified for its usability. Here, we studied the applicability of LLM-based translation of Fortran to C++ as a step towards building an agentic workflow using open-weight LLMs on two different computational platforms. We statistically quantified the compilation accuracy of the translated C++ codes, measured the similarity of the LLM-translated code to the human-translated C++ code, and statistically quantified the output similarity of the Fortran to C++ translation.
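
The three metrics named in the abstract can be illustrated with a minimal Python sketch. The abstract does not say how the authors computed them, so everything here is an assumption made for illustration: the helper names (`compiles`, `text_similarity`, `compilation_accuracy`), the use of `g++ -fsyntax-only` as the compile check, and the token-level `difflib` ratio as the similarity measure for both code and program output.

```python
import difflib
import shutil
import subprocess
import tempfile
from pathlib import Path


def compiles(cpp_source: str, compiler: str = "g++") -> bool:
    """Return True if the C++ source passes a syntax-only compile (assumed check)."""
    if shutil.which(compiler) is None:
        raise RuntimeError(f"{compiler} not found on PATH")
    with tempfile.TemporaryDirectory() as tmp:
        src = Path(tmp) / "candidate.cpp"
        src.write_text(cpp_source)
        # -fsyntax-only parses and type-checks without producing a binary.
        result = subprocess.run(
            [compiler, "-fsyntax-only", str(src)], capture_output=True
        )
        return result.returncode == 0


def text_similarity(a: str, b: str) -> float:
    """Token-level similarity in [0, 1]; usable for code or for program output."""
    return difflib.SequenceMatcher(None, a.split(), b.split()).ratio()


def compilation_accuracy(flags: list[bool]) -> float:
    """Fraction of translations that compiled; 0.0 for an empty list."""
    return sum(flags) / len(flags) if flags else 0.0
```

Under these assumptions, `compilation_accuracy` would be applied to the `compiles` results over all translated files, while `text_similarity` would be called once on (LLM translation, human translation) pairs and once on (Fortran output, C++ output) pairs.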

