43.1QUANT-PHMay 19Code
QUTest: A Native Testing Framework for Quantum ProgramsJosé Campos
Quantum programs are often shared as OpenQASM 3 circuits, but tests are still written in host languages such as Python with Qiskit. We present QUTest, a native framework in which both programs and tests are standard .qasm files. Tests follow the Arrange / Act / Assert pattern, while configuration, runtime requirements, and assertions are encoded as pragma comments (//%), preserving compatibility with existing OpenQASM tools. QUTest provides 12 assertion types spanning deterministic, statistical, quantum-state, and structural checks, plus a linter and an environment-aware mode for running the same test across selected runtime versions in isolated environments. Its CLI supports automatic test discovery, runtime compatibility checks, and XML reports for continuous integration. We describe the pragma language, implementation, and a planned evaluation using coverage and mutation testing. QUTest is available at https://github.com/QBugs/qutest. Video demo: https://youtu.be/FvgvsiAXuW0.
7.2SEApr 12
MutDafny: A Mutation-Based Approach to Assess Dafny SpecificationsIsabel Amaral, Alexandra Mendes, José Campos
In verification-aware languages, such as Dafny, despite their critical role, specifications are as prone to error as implementations. Flaws in specifications can result in formally verified programs that deviate from the intended behavior. In this paper, we explore the use of mutation testing to reveal weaknesses in formal specifications written in Dafny. We present MutDafny, a tool that increases the reliability of Dafny specifications by automatically signaling potential weaknesses. Using a mutation testing approach, we introduce faults (mutations) into the code and rely on formal specifications for detecting them. If a program with a mutant verifies, this may indicate a weakness in the specification. We extensively analyze mutation operators from popular tools, identifying the ones applicable to Dafny. In addition, we synthesize new operators tailored for the language from bugfix commits in publicly available Dafny projects on GitHub. Drawing from both, we equipped our tool with a total of 40 mutation operators. We evaluate MutDafny's effectiveness and efficiency on a dataset of 794 real-world Dafny programs, and manually analyze a subset of the resulting undetected mutants, identifying five weak real-world specifications (on average, one at every 241 lines of code) that would benefit from strengthening.
80.7QUANT-PHApr 29
Probabilistic Condition, Decision and Path Coverage of Circuit-based Quantum ProgramsDaniel Fortunato, José Campos, Rui Abreu
Coverage criteria play a central role in assessing test adequacy in classical software, yet their effectiveness for quantum programs remains poorly understood and largely unexplored. In this paper, we propose six quantum-tailored criteria - condition, decision, and path coverage, and their probabilistic variants - adapted from their classical counterparts. We present QaCoCo, a tool that computes these criteria for circuit-based quantum programs. We empirically evaluate these criteria on a large and diverse set of 540 circuits and analyze the coverage achieved. Our results show that while circuits frequently achieve high condition and decision coverage (97.56% and 97.63%, on average), path coverage remains limited (71.84%), particularly in the presence of multi-controlled gates, which induce extreme path explosion and coverage imbalance. Moreover, to account for the probabilistic nature of quantum circuits, we introduce probabilistic coverage, which augments structural coverage with a confidence measure (88.87%, 88.65%, and 37.18% for condition, decision, and path coverage, respectively, on average). Finally, through mutation testing, we find weak or no correlation between fault detection and structural coverage, consistent with observations in classical computing.
SEMar 31, 2021
QBugs: A Collection of Reproducible Bugs in Quantum Algorithms and a Supporting Infrastructure to Enable Controlled Quantum Software Testing and Debugging ExperimentsJosé Campos, André Souto
Reproducibility and comparability of empirical results are at the core tenet of the scientific method in any scientific field. To ease reproducibility of empirical studies, several benchmarks in software engineering research, such as Defects4J, have been developed and widely used. For quantum software engineering research, however, no benchmark has been established yet. In this position paper, we propose a new benchmark -- named QBugs -- which will provide experimental subjects and an experimental infrastructure to ease the evaluation of new research and the reproducibility of previously published results on quantum software engineering.