PyResBugs: A Dataset of Residual Python Bugs for Natural Language-Driven Fault Injection
PyResBugs gives researchers a resource for advancing AI-driven automated testing in Python, focused on residual bugs that escape traditional testing and surface only in production.
The paper tackles the problem of residual bugs in Python systems by introducing PyResBugs, a curated dataset with multi-level natural language descriptions, enabling natural language-driven fault injection to simulate real-world faults.
This paper presents PyResBugs, a curated dataset of residual bugs, i.e., defects that persist undetected during traditional testing but later surface in production, collected from major Python frameworks. Each bug in the dataset is paired with its corresponding fault-free (fixed) version and annotated with multi-level natural language (NL) descriptions. These NL descriptions enable natural language-driven fault injection, offering a novel approach to simulating real-world faults in software systems. By bridging the gap between software fault injection techniques and real-world representativeness, PyResBugs provides researchers with a high-quality resource for advancing AI-driven automated testing in Python systems.
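To make the pairing described above concrete, the following is a minimal sketch of how one entry might be represented in Python. The class name `ResidualBugEntry`, its field names, the description levels `"high"` and `"low"`, and the sample bug are all hypothetical and chosen for illustration; they are not the dataset's actual schema or contents.

```python
from dataclasses import dataclass, field
import difflib


@dataclass
class ResidualBugEntry:
    """Hypothetical record: a faulty snippet, its fix, and NL descriptions."""
    project: str                      # framework the bug was collected from
    faulty_code: str                  # version containing the residual bug
    fixed_code: str                   # corresponding fault-free version
    descriptions: dict[str, str] = field(default_factory=dict)  # level -> text

    def injection_diff(self) -> str:
        """Unified diff that turns the fixed code into the faulty code,
        i.e. the change a fault injector would need to reproduce."""
        return "".join(
            difflib.unified_diff(
                self.fixed_code.splitlines(keepends=True),
                self.faulty_code.splitlines(keepends=True),
                fromfile="fixed",
                tofile="faulty",
            )
        )


# Illustrative entry only; not taken from the dataset.
entry = ResidualBugEntry(
    project="example-framework",
    faulty_code="def get_timeout(cfg):\n    return cfg['timeout']\n",
    fixed_code="def get_timeout(cfg):\n    return cfg.get('timeout', 30)\n",
    descriptions={
        "high": "Make the timeout lookup fail when the setting is absent.",
        "low": "Replace cfg.get('timeout', 30) with cfg['timeout'].",
    },
)

diff = entry.injection_diff()
print(diff)
```

In this sketch, a natural language-driven fault injector would receive the fixed code and one of the descriptions as input, and its output could be compared against `faulty_code` or against the diff. Keeping the descriptions in a mapping keyed by level is one simple way to model the multi-level annotations; the real dataset may organize them differently.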