SE | AI | May 9, 2025

PyResBugs: A Dataset of Residual Python Bugs for Natural Language-Driven Fault Injection

arXiv:2505.05777v1 | 2 citations | h-index: 12 | Has Code
2025 IEEE/ACM Second International Conference on AI Foundation Models and Software Engineering (Forge)
Originality: Synthesis-oriented
AI Analysis

PyResBugs gives researchers a resource for advancing AI-driven automated testing in Python. It targets a domain-specific problem: residual bugs that escape traditional testing.

The paper tackles the problem of residual bugs in Python systems by introducing PyResBugs, a curated dataset with multi-level natural language descriptions, enabling natural language-driven fault injection to simulate real-world faults.

This paper presents PyResBugs, a curated dataset of residual bugs, i.e., defects that persist undetected during traditional testing but later surface in production, collected from major Python frameworks. Each bug in the dataset is paired with its corresponding fault-free (fixed) version and annotated with multi-level natural language (NL) descriptions. These NL descriptions enable natural language-driven fault injection, offering a novel approach to simulating real-world faults in software systems. By bridging the gap between software fault injection techniques and real-world representativeness, PyResBugs provides researchers with a high-quality resource for advancing AI-driven automated testing in Python systems.
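As a rough illustration of the record structure the abstract describes (a faulty version, its fixed counterpart, and NL descriptions at several levels), a minimal sketch follows. The class name, field names, description levels, and sample bug are all hypothetical and chosen for illustration; they are not the dataset's actual schema or contents.

```python
from dataclasses import dataclass, field
from typing import Dict
import difflib


@dataclass
class ResidualBugRecord:
    """Hypothetical record layout; not the actual PyResBugs schema."""
    project: str                       # framework the bug came from (assumed field)
    faulty_code: str                   # version containing the residual bug
    fixed_code: str                    # corresponding fault-free version
    nl_descriptions: Dict[str, str] = field(default_factory=dict)  # level -> text

    def injection_diff(self) -> str:
        """Unified diff from the fixed to the faulty version,
        i.e. the change a fault injector would need to reproduce."""
        return "".join(difflib.unified_diff(
            self.fixed_code.splitlines(keepends=True),
            self.faulty_code.splitlines(keepends=True),
            fromfile="fixed",
            tofile="faulty",
        ))


# Made-up example entry; the level names are assumptions.
record = ResidualBugRecord(
    project="example_framework",
    fixed_code="def is_empty(items):\n    return len(items) == 0\n",
    faulty_code="def is_empty(items):\n    return len(items) == 1\n",
    nl_descriptions={
        "high_level": "Make the emptiness check return the wrong result.",
        "detailed": "In is_empty, compare the length against 1 instead of 0.",
    },
)

diff_text = record.injection_diff()
print(diff_text)
```

Under this layout, an NL-driven fault injector would take the fixed code plus one description as input, and the diff gives a reference for checking whether the generated fault matches the real one.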

Code Implementations: 1 repo
