RST-style Discourse Parsing Guided by Document-level Content Structures
This work addresses a specific bottleneck in RST-style discourse parsing, namely weak relation prediction for large text spans, and offers an incremental enhancement to existing parsing pipelines.
The paper tackles the low performance of Rhetorical Structure Theory-based discourse parsing when predicting discourse relations for large text spans. It does so by incorporating document-level content structures, which yields promising improvements across various parsing metrics.
Rhetorical Structure Theory-based Discourse Parsing (RST-DP) explores how clauses, sentences, and large text spans compose a whole discourse, and presents the rhetorical structure as a hierarchical tree. Existing RST parsing pipelines construct rhetorical structures without knowledge of document-level content structures, which leads to relatively low performance when predicting discourse relations for large text spans. Recognizing the value of high-level content-related information in facilitating discourse relation recognition, we propose a novel pipeline for RST-DP that incorporates structure-aware news content sentence representations derived from the task of News Discourse Profiling. With only a few additional layers, this enhanced pipeline exhibits promising performance across various RST parsing metrics.
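
To make the idea concrete, the following is a minimal sketch of how sentence-level content-structure representations might be attached to the units of an RST tree. It is not the authors' architecture: the abstract does not specify the fusion layers, so the concatenation and mean pooling used here are assumptions, and the names `RSTNode`, `fuse_with_content`, and `span_vector` are hypothetical. The vectors are toy values standing in for learned encoder outputs.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class RSTNode:
    """A node in a binarized RST tree; leaves are elementary discourse units (EDUs)."""
    start: int                          # index of the first EDU covered (inclusive)
    end: int                            # index of the last EDU covered (inclusive)
    relation: Optional[str] = None      # relation label joining the two children
    nuclearity: Optional[str] = None    # e.g. "NS", "SN", or "NN"
    left: Optional["RSTNode"] = None
    right: Optional["RSTNode"] = None

    def is_leaf(self) -> bool:
        return self.left is None and self.right is None


def fuse_with_content(
    edu_vecs: List[List[float]],
    edu_to_sent: List[int],
    sent_content_vecs: List[List[float]],
) -> List[List[float]]:
    """Assumed fusion step: concatenate each EDU vector with the
    content-structure vector of the sentence that contains it."""
    return [
        edu_vecs[i] + sent_content_vecs[edu_to_sent[i]]
        for i in range(len(edu_vecs))
    ]


def span_vector(fused: List[List[float]], node: RSTNode) -> List[float]:
    """Assumed span encoder: mean-pool the fused EDU vectors in the node's span."""
    rows = fused[node.start : node.end + 1]
    dim = len(rows[0])
    return [sum(row[d] for row in rows) / len(rows) for d in range(dim)]


# Toy document: three EDUs spread over two sentences.
edu_vecs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
edu_to_sent = [0, 0, 1]                  # EDUs 0 and 1 belong to sentence 0
sent_content_vecs = [[0.5], [1.5]]       # stand-ins for content-structure features

root = RSTNode(
    start=0, end=2, relation="Elaboration", nuclearity="NS",
    left=RSTNode(start=0, end=1, relation="Attribution", nuclearity="SN",
                 left=RSTNode(0, 0), right=RSTNode(1, 1)),
    right=RSTNode(2, 2),
)

fused = fuse_with_content(edu_vecs, edu_to_sent, sent_content_vecs)
root_vec = span_vector(fused, root)      # input a relation classifier could consume
```

Under these assumptions, the representation of a large span such as `root` carries sentence-level content information in addition to EDU-level features, which is the kind of signal the abstract argues is missing when relations between large spans are predicted.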