OutFlip: Generating Out-of-Domain Samples for Unknown Intent Detection with Natural Language Attack
This work addresses the challenge of handling unsupported inputs in dialogue systems, though it appears incremental because it adapts an existing attack method (HotFlip) to a new purpose.
The paper tackles out-of-domain input detection in task-oriented dialogue systems by proposing OutFlip, a method that automatically generates out-of-domain samples from in-domain training data. Adding these samples to the training set significantly improves detection performance.
Out-of-domain (OOD) input detection is vital in a task-oriented dialogue system because accepting unsupported inputs can lead the system to respond incorrectly. This paper proposes OutFlip, a method that automatically generates OOD samples using only the in-domain training dataset. HotFlip, a white-box natural language attack method, is revised to generate OOD samples instead of adversarial examples. Our evaluation results show that integrating OutFlip-generated OOD samples into the training dataset can significantly improve an intent classification model's OOD detection performance.
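
To make the underlying mechanism concrete, the following is a minimal sketch of a HotFlip-style token flip: a first-order (gradient-based) estimate of which single-token substitution would increase the classifier's loss the most. It is not the OutFlip algorithm itself; the abstract does not say how OutFlip changes the selection criterion or how the generated sentences are labeled. The toy classifier (mean-pooled embeddings followed by a linear layer) and the names `best_flip`, `loss_gradient_wrt_pooled`, `E`, and `W` are assumptions made only for illustration.

```python
import math
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def loss_gradient_wrt_pooled(token_ids, label, E, W):
    """Gradient of the cross-entropy loss w.r.t. the mean-pooled embedding.

    Toy model (an assumption): logits = W @ mean(E[t] for t in token_ids).
    """
    n, d = len(token_ids), len(E[0])
    pooled = [sum(E[t][k] for t in token_ids) / n for k in range(d)]
    logits = [dot(row, pooled) for row in W]
    top = max(logits)
    exps = [math.exp(z - top) for z in logits]  # numerically stable softmax
    total = sum(exps)
    probs = [e / total for e in exps]
    err = [p - (1.0 if c == label else 0.0) for c, p in enumerate(probs)]
    return [sum(err[c] * W[c][k] for c in range(len(W))) for k in range(d)]


def best_flip(token_ids, label, E, W):
    """Return (position, new_token_id, estimated_loss_increase).

    The estimate is the first-order term (E[new] - E[old]) . dL/dx_i,
    where dL/dx_i = dL/dpooled / n because of mean pooling.
    """
    n = len(token_ids)
    grad = loss_gradient_wrt_pooled(token_ids, label, E, W)
    scores = [dot(e, grad) / n for e in E]  # one score per vocabulary entry
    best = None
    for pos, old in enumerate(token_ids):
        for new in range(len(E)):
            if new == old:
                continue  # skip no-op substitutions
            gain = scores[new] - scores[old]
            if best is None or gain > best[2]:
                best = (pos, new, gain)
    return best


# Hypothetical data: 20-word vocabulary, 8-dim embeddings, 3 intents.
random.seed(0)
E = [[random.gauss(0, 1) for _ in range(8)] for _ in range(20)]
W = [[random.gauss(0, 1) for _ in range(8)] for _ in range(3)]
sentence = [4, 11, 7, 2]
pos, new_id, gain = best_flip(sentence, 1, E, W)
flipped = sentence[:pos] + [new_id] + sentence[pos + 1:]
print(sentence, "->", flipped, "estimated loss increase:", round(gain, 3))
```

In an adversarial attack, the flipped sentence would be expected to keep its original intent while fooling the model. Under the idea described above, a sentence produced by such flips would instead be treated as a candidate out-of-domain sample and added to the training data. The gain is only a linear estimate, so a practical pipeline would presumably re-score the flipped sentence with the actual model before using it.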