Annotate Rhetorical Relations with INCEpTION: A Comparison with Automatic Approaches
This work addresses discourse parsing for NLP applications, but its contribution is incremental: it applies existing methods to a single domain (sports reports).
This research compares manual annotation of rhetorical relations in INCEpTION with automatic classifiers such as BERT and DistilBERT on cricket news data, and finds that DistilBERT achieved the highest classification accuracy.
This research explores the annotation of rhetorical relations in discourse using the INCEpTION tool and compares manual annotation with automatic approaches, including transformer-based language models. The study focuses on sports reports (specifically cricket news) and evaluates how well BERT, DistilBERT, and Logistic Regression models classify rhetorical relations such as elaboration, contrast, background, and cause-effect. DistilBERT achieved the highest accuracy of the three, highlighting its potential for efficient discourse relation prediction. This work contributes to the growing intersection of discourse parsing and transformer-based NLP. (This work was conducted as part of an academic requirement under the supervision of Prof. Dr. Ralf Klabunde, Linguistic Data Science Lab, Ruhr University Bochum.)
Keywords: Rhetorical Structure Theory, INCEpTION, BERT, DistilBERT, Discourse Parsing, NLP.