CL, AI | Sep 11, 2021

Empirical Analysis of Training Strategies of Transformer-based Japanese Chit-chat Systems

arXiv:2109.05217v1 | 59 citations
Originality: Synthesis-oriented
AI Analysis

This paper addresses the lack of analysis of how fine-tuning datasets affect dialogue quality and extends Transformer-based approaches to Japanese, an incremental contribution to language-specific conversational AI.

The study developed large-scale Transformer-based Japanese dialogue models and datasets to examine their effectiveness for chit-chat systems, analyzing how fine-tuning datasets, model parameters, and additional information affect users' impressions of the dialogues.

In recent years, several high-performance conversational systems based on the Transformer encoder-decoder model have been proposed. Although previous studies analyzed the effects of model parameters and decoding methods on subjective dialogue evaluations using overall metrics, they did not analyze how differences in fine-tuning datasets affect users' detailed impressions. In addition, the Transformer-based approach has been verified only for English, not for languages that are linguistically distant from English, such as Japanese. In this study, we develop large-scale Transformer-based Japanese dialogue models and Japanese chit-chat datasets to examine the effectiveness of the Transformer-based approach for building chit-chat dialogue systems. We evaluate and analyze human impressions of the dialogues across different fine-tuning datasets, model parameters, and uses of additional information.

Code Implementations: 1 repo
