CL · Feb 17, 2021

Metrical Tagging in the Wild: Building and Annotating Poetry Corpora with Rhythmic Features

arXiv:2102.08858v2 · 804 citations
AI Analysis

This work addresses the inconsistency and limited scope of existing poetry corpora for computational literary studies and enables large-scale analysis of poetic features. Its contribution is incremental in that it applies existing neural methods to this domain.

The authors build large English and German poetry corpora and annotate prosodic features in smaller corpora to train neural models for robust large-scale analysis. BiLSTM-CRF models with syllable embeddings outperform a CRF baseline and several BERT-based approaches. In a multi-task setup, foot boundaries are learned better when syllable stress is predicted jointly, and caesuras prove to depend on syntax.

A prerequisite for the computational study of literature is the availability of properly digitized texts, ideally with reliable metadata and ground-truth annotation. Poetry corpora do exist for a number of languages, but larger collections lack consistency and are encoded in various standards, while annotated corpora are typically constrained to a particular genre and/or were designed for the analysis of certain linguistic features (like rhyme). In this work, we provide large poetry corpora for English and German, and annotate prosodic features in smaller corpora to train corpus-driven neural models that enable robust large-scale analysis. We show that BiLSTM-CRF models with syllable embeddings outperform a CRF baseline and different BERT-based approaches. In a multi-task setup, particularly beneficial task relations illustrate the inter-dependence of poetic features: a model learns foot boundaries better when jointly predicting syllable stress; aesthetic emotions and verse measures benefit from each other; and caesuras are quite dependent on syntax and also integral to shaping the overall measure of the line.

Code Implementations: 1 repo