CL · Jun 1, 2023

Divide, Conquer, and Combine: Mixture of Semantic-Independent Experts for Zero-Shot Dialogue State Tracking

arXiv:2306.00434v1227 citations · h-index: 36
Originality: Incremental advance
AI Analysis

This work addresses zero-shot transfer learning for task-oriented dialogue systems, which lets them handle diverse domains without costly in-domain data collection. The contribution is incremental, as it builds on existing methods such as T5-Adapter.

The paper tackles the problem of zero-shot dialogue state tracking by proposing a mixture-of-experts approach that disentangles semantics from seen data, achieving state-of-the-art performance on MultiWOZ2.1 without external knowledge and using only 10M trainable parameters.

Zero-shot transfer learning for Dialogue State Tracking (DST) helps to handle a variety of task-oriented dialogue domains without the cost of collecting in-domain data. Existing works mainly study common data- or model-level augmentation methods to enhance generalization but fail to effectively decouple the semantics of samples, limiting the zero-shot performance of DST. In this paper, we present a simple and effective "divide, conquer and combine" solution, which explicitly disentangles the semantics of seen data and improves performance and robustness with the mixture-of-experts mechanism. Specifically, we divide the seen data into semantically independent subsets and train a corresponding expert on each; newly encountered unseen samples are then mapped to these experts and inferred with our designed mixture-of-experts ensemble inference. Extensive experiments on MultiWOZ2.1 with the T5-Adapter show that our schema significantly and consistently improves zero-shot performance, achieving SOTA in settings without external knowledge, with only 10M trainable parameters.
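The three stages named in the abstract can be illustrated with a small toy sketch. It is not the paper's implementation: the abstract does not say how the subsets are formed or how experts are weighted, so the k-means split, the least-squares "experts" (stand-ins for adapters trained on a T5 backbone), and the distance-based softmax weighting are all assumptions, and the function names `divide`, `conquer`, and `combine` are hypothetical.

```python
import numpy as np

def distances(x, centroids):
    """Euclidean distance from every row of x to every centroid: shape (n, k)."""
    return np.linalg.norm(x[:, None, :] - centroids[None, :, :], axis=-1)

def divide(embeddings, k, iters=10):
    """Divide: toy k-means with farthest-first init (an assumed clustering choice)."""
    centroids = [embeddings[0]]
    for _ in range(1, k):
        gap = distances(embeddings, np.array(centroids)).min(axis=1)
        centroids.append(embeddings[gap.argmax()])
    centroids = np.array(centroids, dtype=float)
    for _ in range(iters):
        assign = distances(embeddings, centroids).argmin(axis=1)
        for j in range(k):
            if (assign == j).any():
                centroids[j] = embeddings[assign == j].mean(axis=0)
    return distances(embeddings, centroids).argmin(axis=1), centroids

def conquer(features, targets, assign, k):
    """Conquer: fit one expert per subset (here a least-squares linear map)."""
    experts = []
    for j in range(k):
        mask = assign == j
        if mask.any():
            w, *_ = np.linalg.lstsq(features[mask], targets[mask], rcond=None)
        else:
            # Empty subset: fall back to a zero expert.
            w = np.zeros((features.shape[1], targets.shape[1]))
        experts.append(w)
    return experts

def combine(x, centroids, experts, temperature=1.0):
    """Combine: weight each expert's output by softmax(-distance to its centroid)."""
    logits = -distances(x, centroids) / temperature
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(logits)
    weights /= weights.sum(axis=1, keepdims=True)
    outputs = np.stack([x @ w for w in experts], axis=1)  # (n, k, out_dim)
    return (weights[:, :, None] * outputs).sum(axis=1), weights

# Toy "seen" data: two well-separated groups that follow different rules.
seen = np.array([[0, 0], [0, 1], [1, 0], [1, 1],
                 [10, 10], [10, 11], [11, 10], [11, 11]], dtype=float)
targets = np.concatenate([seen[:4, :1], 2 * seen[4:, 1:]])

assign, centroids = divide(seen, k=2)
experts = conquer(seen, targets, assign, k=2)

# "Unseen" samples are mapped to the experts and answered by the weighted ensemble.
unseen = np.array([[0.5, 0.5], [10.5, 10.5]])
prediction, weights = combine(unseen, centroids, experts)
```

In this sketch the ensemble is formed over expert outputs, and the `temperature` parameter (also an assumption) controls how sharply the weighting favors the nearest subset: small values approach picking a single expert, large values approach a uniform average.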
