CL | Jun 1, 2023

Divide, Conquer, and Combine: Mixture of Semantic-Independent Experts for Zero-Shot Dialogue State Tracking

Qingyue Wang, Liang Ding, Yanan Cao, Yibing Zhan, Zheng Lin, Shi Wang, Dacheng Tao, Li Guo

arXiv:2306.00434v126.7227 citations | h-index: 36

Originality: Incremental advance

AI Analysis

This work addresses zero-shot transfer learning for task-oriented dialogue systems, which lets a system handle diverse domains without costly in-domain data collection. The advance is incremental because it builds on existing methods such as T5-Adapter.

The paper tackles zero-shot dialogue state tracking by proposing a mixture-of-experts approach that disentangles the semantics of seen data. It achieves state-of-the-art zero-shot performance on MultiWOZ2.1 among settings that use no external knowledge, with only 10M trainable parameters.

Zero-shot transfer learning for Dialogue State Tracking (DST) helps to handle a variety of task-oriented dialogue domains without the cost of collecting in-domain data. Existing works mainly study common data- or model-level augmentation methods to enhance generalization but fail to effectively decouple the semantics of samples, limiting the zero-shot performance of DST. In this paper, we present a simple and effective "divide, conquer and combine" solution, which explicitly disentangles the semantics of seen data and improves performance and robustness through the mixture-of-experts mechanism. Specifically, we divide the seen data into semantically independent subsets and train a corresponding expert on each; newly unseen samples are then mapped to these experts and inferred with a mixture-of-experts using our designed ensemble inference. Extensive experiments on MultiWOZ2.1 upon the T5-Adapter show our schema significantly and consistently improves the zero-shot performance, achieving the SOTA on settings without external knowledge, with only 10M trainable parameters.
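The "divide, conquer and combine" flow described in the abstract can be illustrated with a toy sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the division step uses a plain k-means over 2-D points, each "expert" is just the mean label of its subset (the paper trains T5-Adapter experts), and the combination step weights experts by a softmax over negative squared distance to each cluster centroid. The function names (`fit`, `predict`, `mixture_weights`, and so on) are hypothetical.

```python
import math


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def init_centroids(points, k):
    """Deterministic farthest-point initialisation (an assumption, for reproducibility)."""
    centroids = [list(points[0])]
    while len(centroids) < k:
        far = max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        centroids.append(list(far))
    return centroids


def assign_points(points, centroids):
    """Index of the nearest centroid for every point."""
    return [min(range(len(centroids)), key=lambda c: sq_dist(p, centroids[c]))
            for p in points]


def kmeans(points, k, iters=20):
    """Divide: split the seen data into k subsets (toy stand-in for semantic division)."""
    centroids = init_centroids(points, k)
    for _ in range(iters):
        assign = assign_points(points, centroids)
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, assign_points(points, centroids)


def train_expert(subset_labels):
    """Conquer: a placeholder 'expert' that memorises the mean label of its subset."""
    return sum(subset_labels) / len(subset_labels) if subset_labels else 0.0


def fit(points, labels, k):
    """Cluster the seen data and train one placeholder expert per subset."""
    centroids, assign = kmeans(points, k)
    experts = [train_expert([y for y, a in zip(labels, assign) if a == c])
               for c in range(k)]
    return centroids, experts


def mixture_weights(x, centroids, temperature=1.0):
    """Map an unseen sample to the experts: softmax over negative squared distance."""
    scores = [-sq_dist(x, c) / temperature for c in centroids]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict(x, centroids, experts, temperature=1.0):
    """Combine: weighted sum of the expert outputs for an unseen sample."""
    weights = mixture_weights(x, centroids, temperature)
    return sum(w * e for w, e in zip(weights, experts))
```

A sample close to one cluster is handled almost entirely by that cluster's expert, while a sample between clusters receives a blend of expert outputs. The `temperature` parameter (also an assumption) controls how sharp that blend is.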
