GumDrop at the DISRPT2019 Shared Task: A Model Stacking Approach to Discourse Unit Segmentation and Connective Detection
This work addresses discourse parsing tasks for natural language processing researchers, presenting an incremental improvement through model stacking.
The paper tackled the problem of automatic discourse unit segmentation and connective detection by developing GumDrop, a model stacking approach that uses heterogeneous ensembles of classifiers feeding into a metalearner for each task, achieving generalization across datasets of varying sizes and homogeneity.
In this paper we present GumDrop, Georgetown University's entry at the DISRPT 2019 Shared Task on automatic discourse unit segmentation and connective detection. Our approach relies on model stacking, creating a heterogeneous ensemble of classifiers, which feed into a metalearner for each final task. The system encompasses three trainable component stacks: one for sentence splitting, one for discourse unit segmentation and one for connective detection. The flexibility of each ensemble allows the system to generalize well to datasets of different sizes and with varying levels of homogeneity.