Mar 17, 2022

EVA2.0: Investigating Open-Domain Chinese Dialogue Systems with Large-Scale Pre-Training

Tsinghua
arXiv:2203.09313v3
59 citations
h-index: 74
Has Code
Originality: Incremental advance
AI Analysis

This work addresses the development of human-like chatbots for Chinese users, but it is incremental as it builds on existing pre-training approaches with specific optimizations.

The paper tackles the problem of building open-domain Chinese dialogue systems by investigating key factors like data quality and model design, resulting in EVA2.0, a 2.8 billion parameter model that significantly outperforms other open-source models in evaluations.

Large-scale pre-training has shown remarkable performance in building open-domain dialogue systems. However, previous works mainly focus on showing and evaluating the conversational performance of the released dialogue model, ignoring the discussion of some key factors towards a powerful human-like chatbot, especially in Chinese scenarios. In this paper, we conduct extensive experiments to investigate these under-explored factors, including data quality control, model architecture designs, training approaches, and decoding strategies. We propose EVA2.0, a large-scale pre-trained open-domain Chinese dialogue model with 2.8 billion parameters, and will make our models and codes publicly available. Automatic and human evaluations show that EVA2.0 significantly outperforms other open-source counterparts. We also discuss the limitations of this work by presenting some failure cases and pose some future research directions on large-scale Chinese open-domain dialogue systems.

Code Implementations: 2 repos
