CL · Sep 18, 2024

Enabling Real-Time Conversations with Minimal Training Costs

Wang Xu, Shuo Wang, Weilin Zhao, Xu Han, Yukun Yan, Yudi Zhang, Zhe Tao, Zhiyuan Liu, Wanxiang Che

Tsinghua

arXiv:2409.11727v16.19 citations · h-index: 44

Originality: Incremental advance

AI Analysis

This work addresses the need for more efficient real-time dialogue systems in AI applications, though it appears incremental as it builds on existing duplex models.

The paper tackles the problem of enabling real-time conversations with large language models (LLMs) by introducing a duplex decoding approach that requires minimal additional training, resulting in significantly enhanced naturalness and human-likeness in user-AI interactions.

Large language models (LLMs) have demonstrated the ability to improve human efficiency through conversational interactions. Conventional LLM-powered dialogue systems, operating on a turn-based paradigm, preclude real-time interaction during response generation. To address this limitation, researchers have proposed duplex models. These models can dynamically adapt to user input, facilitating real-time interactive feedback. However, these methods typically require substantial computational resources to acquire this duplex ability. To reduce overhead, this paper presents a new duplex decoding approach that enhances LLMs with duplex ability, requiring minimal additional training. Specifically, our method employs parallel decoding of queries and responses in conversations, effectively implementing a channel-division-multiplexing decoding strategy. Experimental results indicate that our proposed method significantly enhances the naturalness and human-likeness of user-AI interactions with minimal training costs.