Not Your Grandfathers Test Set: Reducing Labeling Effort for Testing
This paper addresses the costly, labor-intensive process of creating and maintaining test sets for QA practitioners, and it encourages a fundamental rethinking of testing approaches.
The paper tackles the high labeling effort required to build and maintain test sets, which often drift from the production traffic they are meant to represent. It proposes a technique that reduces labeling effort by 80-100% across a range of practical scenarios.
Building and maintaining high-quality test sets remains a laborious and expensive task. As a result, test sets in the real world are often not kept up to date and drift from the production traffic they are supposed to represent. The frequency and severity of this drift raise serious concerns over the value of manually labeled test sets in the QA process. This paper proposes a simple but effective technique that drastically reduces the effort needed to construct and maintain a high-quality test set (reducing labeling effort by 80-100% across a range of practical scenarios). This result encourages a fundamental rethinking of the testing process by both practitioners, who can use these techniques immediately to improve their testing, and researchers, who can help address many of the open questions raised by this new approach.