SE · PL · Apr 5

COBOLAssist: Analyzing and Fixing Compilation Errors for LLM-Powered COBOL Code Generation

Anh T. V. Dau, Shin Hwei Tan, Jinqiu Yang, Nghi D. Q. Bui, Anh Tuan Nguyen

arXiv:2604.03978 · 71.9

Predicted impact top 23% in SE · last 90 days · Originality Incremental advance

AI Analysis

This work addresses the challenge of maintaining legacy COBOL systems for businesses facing a shortage of skilled developers, though the advance is incremental because it builds on existing LLM techniques.

The paper tackled the problem of compilation errors in LLM-generated COBOL code by proposing COBOLAssist, a framework that uses iterative repairs guided by compilation feedback, which increased compilation success rates from 29.5% to 64.38% for GPT-4o-mini and from 41.8% to 95.89% for GPT-4o.

Legacy programming languages such as COBOL (Common Business-Oriented Language) remain critical in business computing. However, maintaining legacy COBOL systems is increasingly challenging due to a declining pool of skilled developers and the persistence of COBOL errors that require deep domain expertise to resolve. This paper investigates the challenges of COBOL compilation errors and introduces a framework leveraging large language models (LLMs) to address these issues. We first categorize the common compilation errors in LLM-generated COBOL code into three groups: incomplete code errors, syntax errors, and type-related errors. We further propose COBOLAssist, a technique to enhance code correctness through iterative repairs guided by compilation feedback. Our evaluation using five LLMs, including GPT variants and mAInframer, shows a high prevalence of incorrect program structures and function usage in COBOL programs and demonstrates the effectiveness of COBOLAssist, with the compilation success rates increasing from 29.5% to 64.38% for GPT-4o-mini and from 41.8% to 95.89% for GPT-4o. It also improves pass@1 significantly, for example from 9.1 to 22.6 for GPT-4. Notably, while mAInframer-34B achieves the highest compilation success rate, its functional correctness remains limited. This research not only highlights the limitations in current LLMs for COBOL but also demonstrates a practical path forward for automated debugging in legacy systems.
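The abstract describes iterative repair guided by compilation feedback but gives no implementation details. The following is a minimal sketch of what such a loop might look like, not the authors' method. The names `iterative_repair`, `compile_fn`, `repair_fn`, and `max_attempts` are hypothetical, and the toy compiler and toy repairer are stand-ins for a real COBOL compiler and an LLM call.

```python
from dataclasses import dataclass
from typing import Callable, Tuple

# Hypothetical interfaces (assumptions, not taken from the paper):
#   compile_fn(source) -> (ok, diagnostics)
#   repair_fn(source, diagnostics) -> revised source
CompileFn = Callable[[str], Tuple[bool, str]]
RepairFn = Callable[[str, str], str]


@dataclass
class RepairResult:
    source: str     # final version of the program text
    compiled: bool  # whether the final version compiled
    attempts: int   # number of repair calls that were made


def iterative_repair(source: str, compile_fn: CompileFn, repair_fn: RepairFn,
                     max_attempts: int = 3) -> RepairResult:
    """Compile, and while compilation fails, ask for a repair and retry."""
    attempts = 0
    ok, diagnostics = compile_fn(source)
    while not ok and attempts < max_attempts:
        # Feed the compiler diagnostics back to the repairer.
        source = repair_fn(source, diagnostics)
        attempts += 1
        ok, diagnostics = compile_fn(source)
    return RepairResult(source, ok, attempts)


# Toy stand-ins so the sketch runs without a compiler or a model.
def toy_compile(source: str) -> Tuple[bool, str]:
    # Pretend a program compiles only if it contains a STOP RUN statement.
    if "STOP RUN." in source:
        return True, ""
    return False, "error: missing STOP RUN."


def toy_repair(source: str, diagnostics: str) -> str:
    # Pretend the model reads the diagnostic and appends the missing statement.
    return source + "\n    STOP RUN."


if __name__ == "__main__":
    result = iterative_repair("IDENTIFICATION DIVISION.", toy_compile, toy_repair)
    print(result.compiled, result.attempts)
```

In a real setting, `compile_fn` would wrap an actual COBOL compiler and `repair_fn` would prompt an LLM with the source and the diagnostics. The attempt cap keeps the loop from running indefinitely when a model cannot resolve an error.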
