CL, Aug 16, 2017

Dialogue Act Segmentation for Vietnamese Human-Human Conversational Texts

arXiv:1708.04765v1
Originality: Synthesis-oriented
AI Analysis

This work addresses dialog act segmentation for Vietnamese, a language-specific task useful for virtual assistants and chatbots, but it is incremental as it applies existing methods to a new domain.

The paper tackles the problem of segmenting Vietnamese conversational texts at dialog act boundaries. It compares traditional machine learning methods (maximum entropy and CRFs) with a deep learning approach (Bi-LSTM-CRF) on Facebook message and phone conversation datasets, and finds that the deep learning method performs appreciably better.

Dialog act identification plays an important role in understanding conversations. It has been widely applied in fields such as dialogue systems, machine translation, and automatic speech recognition, and is especially useful in systems with human-computer natural language dialogue interfaces, such as virtual assistants and chatbots. The first step in identifying dialog acts is identifying their boundaries within utterances. In this paper, we focus on segmenting utterances according to dialog act boundaries, i.e., functional segment identification, for Vietnamese utterances. We carefully investigate functional segment identification with two approaches: (1) a machine learning approach using maximum entropy (ME) and conditional random fields (CRFs); and (2) a deep learning approach using a bidirectional Long Short-Term Memory (LSTM) network with a CRF layer (Bi-LSTM-CRF). We evaluate both on two conversational datasets: Facebook messages (Message data) and transcriptions of phone conversations (Phone data). To the best of our knowledge, this is the first work that applies a deep learning based approach to dialog act segmentation. The results show that the deep learning approach performs appreciably better than the traditional machine learning approaches. Moreover, this is also the first study that tackles dialog act and functional segment identification for Vietnamese.
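One common way to cast functional segment identification as a sequence labeling problem, which is the form that ME, CRF, and Bi-LSTM-CRF models consume, is to tag each token as either inside a segment or ending one. The sketch below illustrates that framing only. The two-label I/E scheme, the function names, and the toy English tokens are assumptions made for illustration; the abstract does not state which tag set or tokenization the paper uses.

```python
from typing import List


def segments_to_labels(segments: List[List[str]]) -> List[str]:
    """Tag each token 'E' if it ends a functional segment, otherwise 'I'.

    The I/E tag set is an assumed scheme, not necessarily the paper's.
    """
    labels: List[str] = []
    for seg in segments:
        if not seg:
            continue  # skip empty segments
        labels.extend(["I"] * (len(seg) - 1) + ["E"])
    return labels


def labels_to_segments(tokens: List[str], labels: List[str]) -> List[List[str]]:
    """Rebuild functional segments from per-token boundary labels."""
    if len(tokens) != len(labels):
        raise ValueError("tokens and labels must have the same length")
    segments: List[List[str]] = []
    current: List[str] = []
    for tok, lab in zip(tokens, labels):
        current.append(tok)
        if lab == "E":
            segments.append(current)
            current = []
    if current:
        # Trailing tokens without a closing 'E' form a final segment.
        segments.append(current)
    return segments


# Hypothetical toy utterance with two functional segments.
example_segments = [["ok"], ["what", "time", "is", "it"]]
example_tokens = [tok for seg in example_segments for tok in seg]
example_labels = segments_to_labels(example_segments)
```

Under this framing, a segmentation model predicts one label per token, and `labels_to_segments` turns those predictions back into segments for evaluation.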

