StyleKQC: A Style-Variant Paraphrase Corpus for Korean Questions and Commands
This work addresses a practical problem for developers of Korean dialog systems by providing a domain-specific resource for controlled style conversion, though the contribution is incremental because it builds on existing paraphrasing methods.
The paper tackles the lack of style-variant paraphrasing resources for Korean questions and commands by constructing a corpus that accounts for both intent and formality, and it reports adequate classification and inference performance with conventional fine-tuning approaches.
Paraphrasing is often performed with little concern for controlled style conversion. Especially for questions and commands, style-variant paraphrasing can be crucial for tone and manner, which also matters for industrial applications such as dialog systems. In this paper, we address this issue with a corpus construction scheme that simultaneously considers the core content and the style of directives, namely intent and formality, for the Korean language. Starting from manually generated natural language queries on six daily topics, we expand the corpus into formal and informal sentences through human rewriting and transferring. We verify the validity and industrial applicability of our approach by checking that conventional fine-tuning approaches reach adequate classification and inference performance on the corpus, and we also propose a supervised formality transfer task.
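To make the idea of a corpus indexed by intent and formality concrete, the following is a minimal sketch of how such records might be grouped into source/target pairs for a supervised formality transfer task. It is not the authors' released format: the `Utterance` fields, the `content_id` key, the intent labels, and the `formality_pairs` helper are all hypothetical names introduced for illustration, and the sample texts are placeholders rather than corpus data.

```python
from collections import defaultdict
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Utterance:
    content_id: str  # hypothetical key shared by paraphrases of one core content
    topic: str       # one of the six daily topics (topic names are assumed)
    intent: str      # intent label; the label set used here is an assumption
    formality: str   # "formal" or "informal"
    text: str


def formality_pairs(utterances, source="informal", target="formal"):
    """Pair each source-style text with each target-style text that
    shares the same content_id and intent."""
    groups = defaultdict(lambda: defaultdict(list))
    for u in utterances:
        groups[(u.content_id, u.intent)][u.formality].append(u.text)
    pairs = []
    for styles in groups.values():
        # Groups missing one of the two styles contribute no pairs.
        pairs.extend(product(styles.get(source, []), styles.get(target, [])))
    return pairs


# Placeholder records, not taken from the corpus.
sample = [
    Utterance("c1", "topic_a", "question", "informal", "informal question"),
    Utterance("c1", "topic_a", "question", "formal", "formal question"),
    Utterance("c1", "topic_a", "command", "formal", "formal command"),
]
print(formality_pairs(sample))
```

Grouping on both the content key and the intent keeps the core content fixed while only the formality varies, which is the property a supervised formality transfer pair needs under these assumptions.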