Fast Few-shot Debugging for NLU Test Suites
This work addresses the inefficiency of debugging NLU models, a practical concern for practitioners, but its contribution is incremental: it builds on existing test-suite and fast-debugging approaches.
The paper tackles few-shot debugging of transformer-based NLU models using test suites, aiming to correct a targeted phenomenon with minimal loss of accuracy on the original test set. It introduces a fast method that samples in-danger examples from the original training set. Compared with other fast methods, it retains higher accuracy on the original test set at comparable debugging accuracy.
We study few-shot debugging of transformer-based natural language understanding models, using recently popularized test suites not just to diagnose but to correct a problem. Given a few debugging examples of a certain phenomenon, and a held-out test set of the same phenomenon, we aim to maximize accuracy on the phenomenon at minimal cost in accuracy on the original test set. We examine several methods that are faster than full-epoch retraining. We introduce a new fast method, which samples a few in-danger examples from the original training set. Compared to fast methods using parameter distance constraints or Kullback-Leibler divergence, we achieve superior accuracy on the original test set for comparable debugging accuracy.
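The abstract does not say how in-danger examples are identified, so the following is only an illustrative sketch under a stated assumption: an original training example counts as "in danger" if the probability of its gold label drops after a trial update on the debugging examples. The function names (`gold_prob_drop`, `sample_in_danger`) and the parameters `k` and `pool_size` are hypothetical and are not taken from the paper.

```python
import math
import random
from typing import List, Optional, Sequence


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over one example's logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def gold_prob_drop(before: Sequence[float], after: Sequence[float], gold: int) -> float:
    """Gold-label probability lost between the original and trial-updated model."""
    return softmax(before)[gold] - softmax(after)[gold]


def sample_in_danger(
    logits_before: Sequence[Sequence[float]],
    logits_after: Sequence[Sequence[float]],
    gold_labels: Sequence[int],
    k: int,
    pool_size: Optional[int] = None,
    seed: int = 0,
) -> List[int]:
    """Sample up to k indices of training examples hurt by a trial debugging update.

    Assumption (not from the source): "in danger" means the gold-label
    probability decreased after the trial update.
    """
    drops = [
        gold_prob_drop(b, a, y)
        for b, a, y in zip(logits_before, logits_after, gold_labels)
    ]
    # Keep only examples whose gold-label probability actually decreased.
    candidates = [i for i, d in enumerate(drops) if d > 0]
    # Rank candidates from the largest drop to the smallest.
    candidates.sort(key=lambda i: drops[i], reverse=True)
    if pool_size is None:
        pool_size = len(candidates)
    pool = candidates[:pool_size]
    # Draw without replacement from the top of the ranking.
    rng = random.Random(seed)
    return rng.sample(pool, min(k, len(pool)))
```

In a setup like this, the sampled examples would be mixed with the few debugging examples in a short fine-tuning run, so that the update is anchored by the original training data that it threatens most. This is offered as one plausible reading of the method, not as the authors' procedure.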