CL LG | Jan 13, 2025

ESURF: Simple and Effective EDU Segmentation

Mohammadreza Sediqin, Shlomo Engelson Argamon

arXiv:2501.07723v1 | 1 citation | h-index: 36

Originality: Incremental advance

AI Analysis

This paper addresses the fundamental task of EDU segmentation for discourse parsing and offers a potentially more training-efficient approach, though it appears incremental because it builds on existing feature-based methods.

The paper tackles the problem of segmenting text into Elemental Discourse Units (EDUs) by introducing a simple method that uses lexical and character n-gram features with random forest classification. The method outperforms other methods both on segmentation itself and when used within a state-of-the-art discourse parser.

Segmenting text into Elemental Discourse Units (EDUs) is a fundamental task in discourse parsing. We present a new simple method for identifying EDU boundaries, and hence segmenting them, based on lexical and character n-gram features, using random forest classification. We show that the method, despite its simplicity, outperforms other methods both for segmentation and within a state-of-the-art discourse parser. This indicates the importance of such features for identifying basic discourse elements, pointing towards potentially more training-efficient methods for discourse analysis.
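As a rough illustration of what "lexical and character n-gram features" around a candidate EDU boundary might look like, here is a minimal sketch. The window size, n-gram orders, feature names, and the helper functions `char_ngrams` and `boundary_features` are all assumptions made for this example; the abstract does not specify the paper's actual feature set.

```python
from typing import Dict, List


def char_ngrams(token: str, n_max: int = 3) -> List[str]:
    """Return all character n-grams of the token for n = 1..n_max."""
    return [token[i:i + n]
            for n in range(1, n_max + 1)
            for i in range(len(token) - n + 1)]


def boundary_features(tokens: List[str], i: int, window: int = 2) -> Dict[str, int]:
    """Binary features for the candidate boundary just before tokens[i].

    Hypothetical scheme (not taken from the paper): the lowercased word and
    its character n-grams at each offset in [-window, window) around the
    candidate position.
    """
    feats: Dict[str, int] = {}
    for offset in range(-window, window):
        j = i + offset
        word = tokens[j].lower() if 0 <= j < len(tokens) else "<pad>"
        feats[f"w[{offset}]={word}"] = 1  # lexical feature
        if word != "<pad>":
            for gram in char_ngrams(word):
                feats[f"c[{offset}]={gram}"] = 1  # character n-gram feature
    return feats


tokens = "He left because it was late".split()
# One feature dict per candidate boundary (before every token except the first).
candidates = [boundary_features(tokens, i) for i in range(1, len(tokens))]
```

Given gold boundary labels, feature dicts like these could be vectorized (for example with scikit-learn's `DictVectorizer`) and passed to a `RandomForestClassifier`, which would then label each candidate position as a boundary or not.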

