SE · CL · Apr 13

DuET: Dual Execution for Test Output Prediction with Generated Code and Pseudocode

Hojae Han, Jaejin Kim, Seung-won Hwang, Yu Jin Kim, Moontae Lee

arXiv:2604.11514 · 23.8 · h-index: 6

Predicted impact: top 20% in SE · last 90 days
Originality: Incremental advance

AI Analysis

For LLM-based test case generation, DuET provides a more reliable output prediction method by mitigating both code execution errors and pseudocode hallucination.

DuET combines direct code execution and LLM-based pseudocode execution via functional majority voting to improve test output prediction reliability. On LiveCodeBench, it achieves state-of-the-art performance, improving Pass@1 by 13.6 percentage points.

This work addresses test output prediction, a key challenge in test case generation. To make LLM-predicted outputs more reliable, prior approaches first generate code to ground the prediction. One grounding strategy is to execute the generated code directly, but even minor errors in that code can cause failures. To address this, we introduce LLM-based pseudocode execution, which grounds prediction on more error-resilient pseudocode and simulates its execution via LLM reasoning. We further propose DuET, a dual-execution framework that combines both approaches through functional majority voting. Our analysis shows that the two approaches are complementary: direct execution suffers from code errors, pseudocode reasoning suffers from hallucination, and each offsets the other's limitation. On LiveCodeBench, DuET achieves state-of-the-art performance, improving Pass@1 by 13.6 pp.
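
The voting step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes that each candidate (a generated program run directly, or an LLM simulating pseudocode) can be modeled as a callable from a test input to a predicted output, that a crashing candidate simply casts no vote, and that "functional majority voting" means picking the output most candidates agree on. The names `majority_vote`, `Predictor`, and the toy candidates are hypothetical.

```python
from collections import Counter
from typing import Callable, Hashable, Optional, Sequence

# Hypothetical stand-in: any candidate that maps a test input to an output.
Predictor = Callable[[str], Hashable]


def majority_vote(predictors: Sequence[Predictor], test_input: str) -> Optional[Hashable]:
    """Return the output predicted by the most candidates, or None if all fail."""
    votes: Counter = Counter()
    for predict in predictors:
        try:
            votes[predict(test_input)] += 1
        except Exception:
            # Assumption: a candidate that raises (e.g. buggy generated code)
            # contributes no vote instead of aborting the whole prediction.
            continue
    if not votes:
        return None
    # Ties are broken in favor of the output seen first.
    return votes.most_common(1)[0][0]


# Toy candidates for the task "sum the integers in the input".
def code_correct(s: str) -> str:
    return str(sum(int(x) for x in s.split()))


def code_broken(s: str) -> str:
    raise ValueError("generated code crashed")


def pseudo_correct(s: str) -> str:
    return "6"  # stands in for an LLM that simulates the pseudocode correctly


def pseudo_hallucinated(s: str) -> str:
    return "7"  # stands in for an LLM that hallucinates a wrong result


candidates = [code_correct, code_broken, pseudo_correct, pseudo_hallucinated]
print(majority_vote(candidates, "1 2 3"))  # prints 6
```

In this toy run the crashed program is ignored and the hallucinated answer is outvoted, which mirrors the complementarity the abstract describes; how the paper actually groups and weights candidates may differ.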
