CL | Aug 27, 2020

Improvement of a dedicated model for open domain persona-aware dialogue generation

arXiv:2008.11970v1 | Has Code
Originality: Synthesis-oriented
AI Analysis

This work addresses efficiency and effectiveness for dialogue systems, but it is incremental as it applies known Transformer improvements to a specific model.

The paper studies how to improve the speed and performance of a Transformer-based model for open-domain persona-aware dialogue generation. It does not quantify the gains here; the dataset consists of multi-turn short dialogues with input sequences of at most 105 tokens.

This paper analyzes several methods proposed in recent years for improving the speed and performance of the Transformer architecture, focusing on their application to training a dedicated model. The dedicated model studied here is an open-domain persona-aware dialogue generation model, and the dataset consists of multi-turn short dialogues in which the total length of a single input sequence is no more than 105 tokens. Therefore, the many improvements to the Transformer architecture and its attention mechanism that target long-sequence processing are not discussed in this paper. The source code of the experiments is open source: https://github.com/ghosthamlet/persona
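As a rough illustration of why long-sequence techniques are out of scope (this sketch is not taken from the paper or its repository): full self-attention builds a score matrix whose size grows with the square of the sequence length. The function name `attention_score_entries` and the comparison length of 4096 tokens are assumptions chosen only for this example; the 105-token cap is the one stated in the abstract.

```python
def attention_score_entries(seq_len: int) -> int:
    """Entries in one head's full self-attention score matrix (seq_len x seq_len)."""
    return seq_len * seq_len


MAX_LEN = 105    # input length cap stated in the abstract
LONG_LEN = 4096  # hypothetical long-document length, for comparison only

short_cost = attention_score_entries(MAX_LEN)
long_cost = attention_score_entries(LONG_LEN)

print(short_cost)               # 11025
print(long_cost)                # 16777216
print(long_cost // short_cost)  # 1521
```

At 105 tokens the score matrix has about eleven thousand entries per head, which is small enough that sparse or linearized attention variants designed for long inputs would likely offer little benefit.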

Code Implementations: 1 repo
