SE | AI | Oct 31, 2025

What a diff makes: automating code migration with large language models

Katherine A. Rosenfeld, Cliff C. Kerr, Jessica Lundin

arXiv:2511.00160v1 | 3.4 | h-index: 16 | Has Code

Originality: Incremental advance

AI Analysis

The paper addresses a challenge software developers face in managing dependency updates, though the advance is incremental because it builds on existing LLM methods for code tasks.

The paper tackles the problem of automating code migration to maintain compatibility with dependency updates using Large Language Models (LLMs), showing that contexts with diffs improve performance, achieving up to 80% correct identification of required changes in a real-world migration.

Modern software programs are built on stacks that change often; these changes introduce updates and improvements but may also break any project that depends on them. In this paper we explore the use of Large Language Models (LLMs) for code migration, specifically the problem of maintaining compatibility with a dependency as it undergoes major and minor semantic version changes. We demonstrate, using metrics such as test coverage and change comparisons, that contexts containing diffs can significantly improve performance over out-of-the-box LLMs and, in some cases, perform better than using code as context. We provide a dataset to assist further work in this problem area, as well as an open-source Python package, AIMigrate, that can be used to assist with migrating code bases. In a real-world migration of TYPHOIDSIM between STARSIM versions, AIMigrate correctly identified 65% of required changes in a single run, increasing to 80% with multiple runs, with 47% of changes generated perfectly.
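To illustrate what a context containing a diff might look like, the following is a minimal sketch using only the Python standard library. It is not AIMigrate's API or the authors' prompt format: the function names `build_diff_context` and `build_migration_prompt`, the prompt wording, and the toy `OLD`/`NEW` sources are all assumptions made for illustration.

```python
import difflib


def build_diff_context(old_source: str, new_source: str,
                       filename: str = "dependency.py") -> str:
    """Return a unified diff between two versions of a dependency file.

    Hypothetical helper: the name and signature are illustrative only.
    """
    diff_lines = difflib.unified_diff(
        old_source.splitlines(keepends=True),
        new_source.splitlines(keepends=True),
        fromfile=f"a/{filename}",
        tofile=f"b/{filename}",
    )
    return "".join(diff_lines)


def build_migration_prompt(diff_text: str, target_code: str) -> str:
    """Combine the dependency diff and the code to migrate into one prompt.

    The prompt wording is an assumption, not taken from the paper.
    """
    return (
        "The dependency changed as shown in this diff:\n\n"
        f"{diff_text}\n"
        "Update the following code so it works with the new version:\n\n"
        f"{target_code}"
    )


# Toy stand-ins for two versions of a dependency and a downstream call site.
OLD = "def run(sim, n_steps):\n    return sim.step(n_steps)\n"
NEW = "def run(sim, steps):\n    return sim.advance(steps)\n"
USER_CODE = "result = run(sim, n_steps=10)\n"

diff_text = build_diff_context(OLD, NEW)
prompt = build_migration_prompt(diff_text, USER_CODE)
print(prompt)
```

The resulting string would then be sent to an LLM of choice. The diff keeps the context short and focused on what changed between versions, rather than including the dependency's full source.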
