Mapping the Dialog Act Annotations of the LEGO Corpus into the Communicative Functions of ISO 24617-2
This work provides more annotated data for dialog research using a widely explored corpus, but it is incremental as it focuses on mapping existing annotations to a new standard.
The paper tackles the problem of limited data annotated with the recent ISO 24617-2 standard by mapping dialog act annotations from the LEGO corpus to this standard, resulting in 347 additional annotated dialogs.
In this paper, we present strategies for mapping the dialog act annotations of the LEGO corpus into the communicative functions of the ISO 24617-2 standard. Using these strategies, we obtained an additional 347 dialogs annotated according to the standard. This is particularly important because, given the recency of the standard, little data annotated according to it exists. Furthermore, these dialogs come from a corpus that is widely explored for dialog-related tasks. However, its dialog act annotations have been neglected because of their high domain dependency, which makes them of little use outside the context of the corpus. Thus, through our mapping process, we both obtain more data annotated according to a recent standard and provide useful dialog act annotations for a corpus that is widely explored in dialog research.
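
As an illustration only, the idea of mapping domain-dependent labels to communicative functions can be sketched as a lookup table with a fallback for labels that need further handling. The source labels below (such as `ask_departure`) are hypothetical placeholders and not the actual LEGO tag set, the helper names `map_label` and `map_dialog` are assumptions, and the table does not reproduce the strategies presented in the paper. Only the target names are communicative functions defined in ISO 24617-2.

```python
from typing import List, Optional, Tuple

# Hypothetical domain-dependent labels (NOT the real LEGO tag set) mapped to
# communicative functions defined in ISO 24617-2.
DIRECT_MAP = {
    "greeting": "initialGreeting",
    "ask_departure": "setQuestion",
    "confirm_departure": "checkQuestion",
    "thanks": "thanking",
    "goodbye": "initialGoodbye",
}


def map_label(label: str) -> Optional[str]:
    """Return the communicative function for a label, or None if unmapped."""
    return DIRECT_MAP.get(label.strip().lower())


def map_dialog(labels: List[str]) -> Tuple[List[Tuple[str, Optional[str]]], List[str]]:
    """Map every label of a dialog and collect the labels with no direct mapping."""
    mapped = []
    unmapped = []
    for label in labels:
        function = map_label(label)
        mapped.append((label, function))
        if function is None:
            # Labels without a one-to-one counterpart would need a
            # context-dependent rule or manual inspection.
            unmapped.append(label)
    return mapped, unmapped


if __name__ == "__main__":
    pairs, leftover = map_dialog(["greeting", "ask_departure", "bus_info"])
    print(pairs)
    print(leftover)
```

Keeping the unmapped labels in a separate list makes it easy to see which annotations cannot be converted by a direct lookup and therefore require a dedicated strategy.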