CL · Nov 30, 2023

Introducing Rhetorical Parallelism Detection: A New Task with Datasets, Metrics, and Baselines

Stephen Bothwell, Justin DeBenedetto, Theresa Crnkovich, Hildegund Müller, David Chiang

arXiv:2312.00100v121.4134 citations · h-index: 3 · Has Code

Originality: Incremental advance

AI Analysis

This work addresses a gap in natural language processing by formalizing rhetorical parallelism detection and providing resources for it. The advance is rated incremental because it establishes a new task with initial benchmarks.

The paper tackles the problem of detecting rhetorical parallelism in text, a stylistic tool in which juxtaposed phrases share the same sequence of linguistic features. It introduces a new task with a formal definition, datasets, metrics, and baseline systems. Under the strictest metric, the baselines achieve F1 scores of 0.40 and 0.43 on the Latin and Chinese datasets, respectively.

Rhetoric, both spoken and written, involves not only content but also style. One common stylistic tool is $\textit{parallelism}$: the juxtaposition of phrases which have the same sequence of linguistic ($\textit{e.g.}$, phonological, syntactic, semantic) features. Despite the ubiquity of parallelism, the field of natural language processing has seldom investigated it, missing a chance to better understand the nature of the structure, meaning, and intent that humans convey. To address this, we introduce the task of $\textit{rhetorical parallelism detection}$. We construct a formal definition of it; we provide one new Latin dataset and one adapted Chinese dataset for it; we establish a family of metrics to evaluate performance on it; and, lastly, we create baseline systems and novel sequence labeling schemes to capture it. On our strictest metric, we attain $F_{1}$ scores of $0.40$ and $0.43$ on our Latin and Chinese datasets, respectively.
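To make the idea of treating parallelism detection as sequence labeling and scoring it with $F_{1}$ more concrete, here is a minimal sketch. It is an illustration only: it uses a generic BIO tagging scheme and a plain exact-match span $F_{1}$, both of which are assumptions and not the paper's novel labeling schemes or its family of metrics. The function names (`spans_to_bio`, `bio_to_spans`, `exact_match_f1`) and the example sentence are hypothetical.

```python
from typing import List, Set, Tuple

Span = Tuple[int, int]  # (start, end) token offsets, end exclusive


def spans_to_bio(n_tokens: int, spans: List[Span]) -> List[str]:
    """Encode non-overlapping spans as generic BIO tags (assumed scheme, not the paper's)."""
    tags = ["O"] * n_tokens
    for start, end in spans:
        tags[start] = "B"
        for i in range(start + 1, end):
            tags[i] = "I"
    return tags


def bio_to_spans(tags: List[str]) -> List[Span]:
    """Decode BIO tags back into (start, end) spans."""
    spans: List[Span] = []
    start = None
    for i, tag in enumerate(tags):
        if tag == "B":
            if start is not None:
                spans.append((start, i))  # close the previous span
            start = i
        elif tag == "O":
            if start is not None:
                spans.append((start, i))
                start = None
        # "I" continues an open span; a stray "I" with no open span is ignored
    if start is not None:
        spans.append((start, len(tags)))
    return spans


def exact_match_f1(gold: Set[Span], pred: Set[Span]) -> float:
    """F1 where a predicted span counts only if its boundaries match a gold span exactly."""
    if not gold and not pred:
        return 1.0
    tp = len(gold & pred)
    precision = tp / len(pred) if pred else 0.0
    recall = tp / len(gold) if gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical example: three parallel phrases of two tokens each.
tokens = ["I", "came", "I", "saw", "I", "conquered"]
gold_spans = [(0, 2), (2, 4), (4, 6)]
gold_tags = spans_to_bio(len(tokens), gold_spans)

# A hypothetical prediction that truncates the last phrase.
pred_spans = [(0, 2), (2, 4), (4, 5)]
score = exact_match_f1(set(gold_spans), set(pred_spans))
```

In this toy case two of the three predicted spans match the gold boundaries exactly, so precision and recall are both 2/3. Looser metrics (for example, ones that give partial credit for overlapping spans) would score the same prediction higher, which is one reason a family of metrics with varying strictness can be useful.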

