CL | Mar 10, 2023

AUTODIAL: Efficient Asynchronous Task-Oriented Dialogue Model

Meta AI, MILA
arXiv:2303.06245v3 | h-index: 13
Originality: Incremental advance
AI Analysis

This work addresses deployment challenges for dialogue models in resource-constrained environments, offering a more efficient alternative to existing generative approaches.

The paper tackles the high compute and memory requirements of deploying large dialogue models by introducing AUTODIAL, a multi-task model with parallel decoders. On three dialogue tasks, it achieves 3-6x faster inference and uses 11x fewer parameters than SimpleTOD.

As large dialogue models become commonplace in practice, the problems of high compute requirements for training and inference and of a larger memory footprint still persist. In this work, we present AUTODIAL, a multi-task dialogue model that addresses the challenges of deploying dialogue models. AUTODIAL uses parallel decoders to perform tasks such as dialogue act prediction, domain prediction, intent prediction, and dialogue state tracking. Using classification decoders instead of generative decoders allows AUTODIAL to significantly reduce its memory footprint and achieve faster inference than an existing generative approach, namely SimpleTOD. We demonstrate that AUTODIAL provides 3-6x speedups during inference while having 11x fewer parameters than SimpleTOD on three dialogue tasks. Our results show that extending current dialogue models with parallel decoders can be a viable alternative for deploying them in resource-constrained environments.
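
The idea of parallel classification decoders over one shared encoding can be illustrated with a toy sketch. This is not the authors' implementation: the hashing "encoder", the random linear heads, the label sets, and every name below (`encode`, `ClassificationHead`, `predict_all`) are hypothetical stand-ins, and dialogue state tracking is left out for brevity. The point it shows is structural: the utterance is encoded once, and each task head then produces its label in a single step, with no token-by-token generation and no dependence on the other heads.

```python
import random

# Hypothetical label sets, for illustration only (not taken from the paper).
LABELS = {
    "dialogue_act": ["inform", "request", "confirm"],
    "domain": ["restaurant", "hotel", "taxi"],
    "intent": ["book", "search", "cancel"],
}
HIDDEN = 8  # toy encoding size (assumption)


def encode(utterance):
    """Toy stand-in for a shared encoder: bucket tokens into a fixed-size vector."""
    vec = [0.0] * HIDDEN
    for token in utterance.lower().split():
        vec[sum(ord(c) for c in token) % HIDDEN] += 1.0
    return vec


class ClassificationHead:
    """One linear layer followed by argmax: a single step per prediction."""

    def __init__(self, labels, rng):
        self.labels = labels
        # Untrained random weights; a real model would learn these.
        self.weights = [[rng.uniform(-1, 1) for _ in range(HIDDEN)] for _ in labels]

    def predict(self, encoding):
        scores = [sum(w * x for w, x in zip(row, encoding)) for row in self.weights]
        return self.labels[scores.index(max(scores))]


def build_heads(seed=0):
    """Create one independent head per task, seeded for reproducibility."""
    rng = random.Random(seed)
    return {task: ClassificationHead(labels, rng) for task, labels in LABELS.items()}


def predict_all(utterance, heads):
    """Encode once, then let every head read the same encoding independently."""
    encoding = encode(utterance)
    return {task: head.predict(encoding) for task, head in heads.items()}


heads = build_heads()
result = predict_all("I want to book a table for two", heads)
print(result)
```

The heads are evaluated in a plain loop here, but because none of them reads another head's output, they could equally be computed side by side. A generative decoder, by contrast, would have to emit each label as a sequence of tokens, one decoding step at a time.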
