CL, May 21, 2023

A Pilot Study on Dialogue-Level Dependency Parsing for Chinese

arXiv:2305.12441v2222 citations
Originality: Synthesis-oriented
AI Analysis

This work addresses a gap in Chinese dialogue parsing. The contribution is incremental, as it builds on existing syntactic treebanks and methods.

The paper tackles dialogue-level dependency parsing for Chinese. It builds a human-annotated corpus of 850 dialogues with 199,803 dependencies and, for zero-shot and few-shot scenarios, proposes signal-based transformation and data selection as baselines, which the experiments show to be effective.

Dialogue-level dependency parsing has received insufficient attention, especially for Chinese. To this end, we draw on ideas from syntactic dependency and rhetorical structure theory (RST), developing a high-quality human-annotated corpus, which contains 850 dialogues and 199,803 dependencies. Considering that such tasks suffer from high annotation costs, we investigate zero-shot and few-shot scenarios. Based on an existing syntactic treebank, we adopt a signal-based method to transform seen syntactic dependencies into unseen ones between elementary discourse units (EDUs), where the signals are detected by masked language modeling. Besides, we apply single-view and multi-view data selection to access reliable pseudo-labeled instances. Experimental results show the effectiveness of these baselines. Moreover, we discuss several crucial points about our dataset and approach.
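
The single-view and multi-view data selection mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not specify the selection criteria, so everything here is an assumption made for illustration: the `PseudoArc` record, the confidence threshold for the single-view case, the label-agreement rule for the multi-view case, and the relation labels are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PseudoArc:
    """A pseudo-labeled dependency between two EDUs (hypothetical record)."""
    head: int          # index of the head EDU
    dependent: int     # index of the dependent EDU
    label: str         # predicted relation label (illustrative names only)
    confidence: float  # model probability for the predicted label


def select_single_view(arcs, threshold=0.9):
    """Assumed single-view rule: keep arcs one model is confident about."""
    return [arc for arc in arcs if arc.confidence >= threshold]


def select_multi_view(view_a, view_b):
    """Assumed multi-view rule: keep arcs from view A whose label matches
    the label view B predicts for the same (head, dependent) pair."""
    labels_b = {(arc.head, arc.dependent): arc.label for arc in view_b}
    return [
        arc for arc in view_a
        if labels_b.get((arc.head, arc.dependent)) == arc.label
    ]


# Toy predictions from two hypothetical views of the same dialogue.
view_a = [
    PseudoArc(head=0, dependent=1, label="cause", confidence=0.95),
    PseudoArc(head=1, dependent=2, label="elaboration", confidence=0.60),
    PseudoArc(head=0, dependent=3, label="contrast", confidence=0.92),
]
view_b = [
    PseudoArc(head=0, dependent=1, label="cause", confidence=0.88),
    PseudoArc(head=1, dependent=2, label="elaboration", confidence=0.91),
    PseudoArc(head=0, dependent=3, label="condition", confidence=0.70),
]

confident = select_single_view(view_a, threshold=0.9)
agreed = select_multi_view(view_a, view_b)
```

In this toy setup the two rules filter differently: the threshold keeps only arcs one model scores highly, while agreement keeps arcs that two independent predictions label the same way, regardless of either score.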
