5IDER: Unified Query Rewriting for Steering, Intent Carryover, Disfluencies, Entity Carryover and Repair
This work aims to improve the multi-turn conversational abilities of voice assistants. It is an incremental advance: it combines existing tasks into a single unified model and delivers efficiency gains.
The paper tackles the challenge of enabling voice assistants to handle multi-turn conversations by addressing five conversational use-cases (steering, intent carryover, disfluencies, entity carryover, and repair) and their compositions. It proposes a non-autoregressive query rewriting model that achieves competitive single-task performance and outperforms a fine-tuned T5 model on use-case compositions, while being 15 times smaller and 25 times faster.
Giving voice assistants the ability to navigate multi-turn conversations is a challenging problem. Handling multi-turn interactions requires the system to understand various conversational use-cases, such as steering, intent carryover, disfluencies, entity carryover, and repair. The problem is compounded by the fact that these use-cases mix with each other and often appear simultaneously in natural language. This work proposes a non-autoregressive query rewriting architecture that can handle not only these five tasks, but also complex compositions of them. We show that our proposed model has competitive single-task performance compared to the baseline approach, and even outperforms a fine-tuned T5 model on use-case compositions, despite being 15 times smaller in parameter count and 25 times faster in latency.
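
To make the task framing concrete, the sketch below shows one plausible way to represent a query rewriting example: prior user turns plus the current utterance map to a single self-contained rewrite. Everything in it is an illustrative assumption rather than the paper's data or method. The `RewriteExample` type, the `to_model_input` helper, the `[SEP]` separator, and all utterances are hypothetical, and the use-case labels reflect our informal reading of the five categories.

```python
from dataclasses import dataclass
from typing import FrozenSet, Tuple


@dataclass(frozen=True)
class RewriteExample:
    """Hypothetical container for one query rewriting example."""
    context: Tuple[str, ...]    # previous user turns, oldest first
    utterance: str              # current user turn
    rewrite: str                # self-contained query the assistant should act on
    use_cases: FrozenSet[str]   # phenomena the example is meant to exercise


SEP = " [SEP] "  # assumed separator; the actual input format is not specified here


def to_model_input(ex: RewriteExample) -> str:
    """Flatten context and current utterance into one input string."""
    return SEP.join((*ex.context, ex.utterance))


# Invented examples, one per use-case, followed by one composition.
EXAMPLES = [
    RewriteExample(("set an alarm",), "for 7 am",
                   "set an alarm for 7 am", frozenset({"steering"})),
    RewriteExample(("what's the weather in Boston",), "and in Seattle",
                   "what's the weather in Seattle", frozenset({"intent carryover"})),
    RewriteExample((), "call mom uh I mean dad",
                   "call dad", frozenset({"disfluencies"})),
    RewriteExample(("what's the weather in Boston",), "how do I get there",
                   "how do I get to Boston", frozenset({"entity carryover"})),
    RewriteExample(("call Jon",), "no I said call John Smith",
                   "call John Smith", frozenset({"repair"})),
    # Composition: intent carryover and a disfluency in the same turn.
    RewriteExample(("what's the weather in Boston",), "and in uh Seattle I mean Portland",
                   "what's the weather in Portland",
                   frozenset({"intent carryover", "disfluencies"})),
]

if __name__ == "__main__":
    for ex in EXAMPLES:
        print(f"{to_model_input(ex)!r} -> {ex.rewrite!r}")
```

The last entry illustrates why compositions are harder than single use-cases: the rewrite must both resolve the self-correction and carry the intent over from the previous turn.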