SE, AI, LG | Dec 17, 2025

Imitation Game: Reproducing Deep Learning Bugs Leveraging an Intelligent Agent

arXiv:2512.14990v3 | 1 citation
Originality: Incremental advance
AI Analysis

The paper addresses a critical problem for deep learning developers and researchers by automating bug reproduction, though it is an incremental improvement over existing methods.

The paper tackles the challenge of reproducing deep learning bugs, which is difficult because of nondeterminism and environmental dependencies. It introduces RepGen, an automated LLM-based approach that achieves an 80.19% reproduction rate on real-world bugs, a 19.81% improvement over the state of the art.
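To make the two obstacles named above concrete, here is a minimal sketch in standard-library Python. It is not taken from the paper: `capture_environment` and `noisy_training_step` are hypothetical names, and the "training step" is a toy stand-in for a real model. It shows only that an unseeded run differs from call to call, a seeded run repeats exactly, and basic environment facts can be recorded alongside a bug report.

```python
import platform
import random
import sys


def capture_environment():
    """Record basic interpreter and host facts to attach to a bug report."""
    return {
        "python": sys.version.split()[0],
        "implementation": platform.python_implementation(),
        "os": platform.system(),
        "machine": platform.machine(),
    }


def noisy_training_step(seed=None):
    """Toy stand-in for a DL training step (hypothetical, not a real model).

    Random initialisation followed by a few random updates. With seed=None
    the generator is seeded from the OS, so results vary between calls.
    """
    rng = random.Random(seed)
    weights = [rng.gauss(0.0, 1.0) for _ in range(4)]
    for _ in range(3):
        weights = [w - 0.1 * rng.uniform(-1.0, 1.0) for w in weights]
    return weights


if __name__ == "__main__":
    print(capture_environment())
    # Same seed: identical weights on every run.
    print(noisy_training_step(seed=0) == noisy_training_step(seed=0))
```

Real DL frameworks have further sources of nondeterminism (for example GPU kernels and data-loader ordering) that a single seed does not control, which is part of why reproduction is hard in practice.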

Despite their wide adoption in various domains (e.g., healthcare, finance, software engineering), Deep Learning (DL)-based applications suffer from many bugs, failures, and vulnerabilities. Reproducing these bugs is essential for their resolution, but it is extremely challenging due to the inherent nondeterminism of DL models and their tight coupling with hardware and software environments. According to recent studies, only about 3% of DL bugs can be reliably reproduced using manual approaches. To address these challenges, we present RepGen, a novel, automated, and intelligent approach for reproducing deep learning bugs. RepGen constructs a learning-enhanced context from a project, develops a comprehensive plan for bug reproduction, employs an iterative generate-validate-refine mechanism, and uses an LLM to generate code that reproduces the bug at hand. We evaluate RepGen on 106 real-world deep learning bugs and achieve a reproduction rate of 80.19%, a 19.81% improvement over the state-of-the-art measure. A developer study involving 27 participants shows that RepGen improves the success rate of DL bug reproduction by 23.35%, reduces the time to reproduce by 56.8%, and lowers participants' cognitive load.
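The generate-validate-refine idea mentioned in the abstract can be illustrated with a generic loop. This is a sketch under stated assumptions, not RepGen's implementation: the abstract does not specify how candidates are validated, so this version simply runs each candidate script and checks whether an expected error string appears on stderr. The `generate` callable stands in for an LLM call, and the names `validate` and `reproduce` are hypothetical.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def validate(script, expected_error, timeout=30.0):
    """Run a candidate script; report whether the expected error shows up."""
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "repro.py"
        path.write_text(script)
        try:
            result = subprocess.run(
                [sys.executable, str(path)],
                capture_output=True,
                text=True,
                timeout=timeout,
            )
        except subprocess.TimeoutExpired:
            return False, "timed out"
    return expected_error in result.stderr, result.stderr


def reproduce(generate, bug_report, expected_error, max_rounds=3):
    """Generic generate-validate-refine loop (assumed shape, not RepGen's).

    `generate(bug_report, feedback)` is a placeholder for an LLM call that
    returns Python source. Feedback from a failed run is passed to the
    next round so the generator can refine its attempt.
    """
    feedback = ""
    for round_no in range(1, max_rounds + 1):
        script = generate(bug_report, feedback)
        ok, output = validate(script, expected_error)
        if ok:
            return script, round_no
        feedback = output
    return None, max_rounds
```

A caller would supply its own `generate` function. Matching on an error string is the simplest possible check, and a real system would likely need stronger evidence that the observed failure is the reported bug.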

