CL Oct 13, 2021

SGD-X: A Benchmark for Robust Generalization in Schema-Guided Dialogue Systems

Harrison Lee, Raghav Gupta, Abhinav Rastogi, Yuan Cao, Bin Zhang, Yonghui Wu

arXiv:2110.06800v34.940 citations Has Code

Originality: Synthesis-oriented

AI Analysis

This paper addresses the challenge of robust generalization to unseen services in dialogue systems, but its contribution is incremental because it builds on the existing SGD dataset.

The authors tackle zero/few-shot transfer in task-oriented dialogue systems by creating SGD-X, a benchmark that tests robustness to linguistic variations in schemas. They find that two top state tracking models fail to generalize well across schema variants, and they present a simple, model-agnostic data augmentation method to improve schema robustness.

Zero/few-shot transfer to unseen services is a critical challenge in task-oriented dialogue research. The Schema-Guided Dialogue (SGD) dataset introduced a paradigm for enabling models to support any service in zero-shot through schemas, which describe service APIs to models in natural language. We explore the robustness of dialogue systems to linguistic variations in schemas by designing SGD-X - a benchmark extending SGD with semantically similar yet stylistically diverse variants for every schema. We observe that two top state tracking models fail to generalize well across schema variants, measured by joint goal accuracy and a novel metric for measuring schema sensitivity. Additionally, we present a simple model-agnostic data augmentation method to improve schema robustness.
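The abstract reports joint goal accuracy across schema variants. As a rough illustration of that measurement, the sketch below computes joint goal accuracy as the fraction of turns whose predicted dialogue state exactly matches the gold state, once per schema variant. It is a simplified sketch, not the paper's evaluation code: exact matching on slot values, the `joint_goal_accuracy` helper, the variant names, and the toy states are all assumptions made for illustration.

```python
from typing import Dict, List

# A dialogue state maps slot names to values (hypothetical representation).
DialogueState = Dict[str, str]


def joint_goal_accuracy(
    predicted: List[DialogueState], gold: List[DialogueState]
) -> float:
    """Fraction of turns where the whole predicted state equals the gold state.

    Assumption: exact string match on every slot; real evaluations may
    use softer matching for free-form slot values.
    """
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must cover the same turns")
    if not gold:
        return 0.0
    correct = sum(1 for p, g in zip(predicted, gold) if p == g)
    return correct / len(gold)


# Toy gold states for two turns of one dialogue (invented data).
gold = [
    {"city": "Paris"},
    {"city": "Paris", "date": "Friday"},
]

# Hypothetical predictions from the same model under two schema phrasings.
predictions_by_variant = {
    "original": [{"city": "Paris"}, {"city": "Paris", "date": "Friday"}],
    "variant_1": [{"city": "Paris"}, {"city": "Paris"}],  # misses "date"
}

jga_by_variant = {
    name: joint_goal_accuracy(preds, gold)
    for name, preds in predictions_by_variant.items()
}

for name, score in jga_by_variant.items():
    print(f"{name}: {score:.2f}")
```

A gap between the per-variant scores, as in this toy data, is the kind of behaviour the benchmark is designed to expose: the dialogue is unchanged and only the schema wording differs.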
