CL, Aug 18, 2025

ding-01 :ARG0: An AMR Corpus for Spontaneous French Dialogue

arXiv:2508.12819v1, 1 citation, h-index: 13
Originality Synthesis-oriented
AI Analysis

This work addresses the scarcity of semantic resources for French dialogue, particularly for spontaneous speech. The contribution is incremental, since it builds on the existing AMR framework rather than proposing a new formalism.

The authors build a corpus of spontaneous French dialogues annotated in Abstract Meaning Representation (AMR), extend the AMR framework to better represent spontaneous speech and French-specific structures, and train an AMR parser on this data to assist annotation.

We present our work to build a French semantic corpus by annotating French dialogue in Abstract Meaning Representation (AMR). Specifically, we annotate the DinG corpus, consisting of transcripts of spontaneous French dialogues recorded during the board game Catan. As AMR has insufficient coverage of the dynamics of spontaneous speech, we extend the framework to better represent spontaneous speech and sentence structures specific to French. Additionally, to support consistent annotation, we provide an annotation guideline detailing these extensions. We publish our corpus under a free license (CC-SA-BY). We also train and evaluate an AMR parser on our data. This model can be used as an assistance annotation tool to provide initial annotations that can be refined by human annotators. Our work contributes to the development of semantic resources for French dialogue.

