CL May 31, 2023

FEED PETs: Further Experimentation and Expansion on the Disambiguation of Potentially Euphemistic Terms

arXiv:2306.00217v2222 citations
Originality: Synthesis-oriented
AI Analysis

This work addresses euphemism disambiguation, a problem of interest to NLP researchers. It is incremental: it builds on existing transformer-based methods, adding new data and minor extensions.

The study expands the euphemism disambiguation task to include vagueness annotation and multilingual corpora. It finds that transformers are generally better at classifying vague terms and establishes preliminary results in Yoruba, Spanish, and Mandarin Chinese.

Transformers have been shown to work well for the task of English euphemism disambiguation, in which a potentially euphemistic term (PET) is classified as euphemistic or non-euphemistic in a particular context. In this study, we expand on the task in two ways. First, we annotate PETs for vagueness, a linguistic property associated with euphemisms, and find that transformers are generally better at classifying vague PETs, suggesting linguistic differences in the data that impact performance. Second, we present novel euphemism corpora in three different languages: Yoruba, Spanish, and Mandarin Chinese. We perform euphemism disambiguation experiments in each language using multilingual transformer models mBERT and XLM-RoBERTa, establishing preliminary results from which to launch future work.
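The vagueness comparison described in the abstract amounts to scoring a binary classifier separately on vague and non-vague PETs. The sketch below shows one way such a breakdown could be computed. It is not the authors' code: the `PetExample` record, the angle-bracket PET marking, the choice of macro-F1 as the metric, and the four toy sentences with their labels are all assumptions made up for illustration, and the resulting numbers carry no empirical meaning.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class PetExample:
    """Hypothetical record for one PET in context (not the paper's schema)."""
    text: str    # sentence with the PET wrapped in angle brackets (assumed marking)
    gold: int    # 1 = euphemistic, 0 = non-euphemistic
    pred: int    # classifier output, same encoding
    vague: bool  # assumed vagueness annotation for the PET


def f1_binary(examples, positive):
    """F1 score treating `positive` as the positive class."""
    tp = sum(1 for e in examples if e.pred == positive and e.gold == positive)
    fp = sum(1 for e in examples if e.pred == positive and e.gold != positive)
    fn = sum(1 for e in examples if e.pred != positive and e.gold == positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(examples):
    """Unweighted mean of the F1 scores of the two classes."""
    return (f1_binary(examples, 0) + f1_binary(examples, 1)) / 2


def macro_f1_by_vagueness(examples):
    """Group examples by the vagueness flag and score each group separately."""
    groups = defaultdict(list)
    for e in examples:
        groups[e.vague].append(e)
    return {vague: macro_f1(group) for vague, group in groups.items()}


# Invented toy data: sentences, labels, predictions, and vagueness flags
# are placeholders, not drawn from the paper's corpora.
toy = [
    PetExample("She <passed away> last year.", gold=1, pred=1, vague=True),
    PetExample("The ball <passed away> from the goal.", gold=0, pred=0, vague=True),
    PetExample("He was <let go> in March.", gold=1, pred=0, vague=False),
    PetExample("She <let go> of the rope.", gold=0, pred=0, vague=False),
]
scores = macro_f1_by_vagueness(toy)
```

In practice the `pred` field would be filled by a fine-tuned classifier such as mBERT or XLM-RoBERTa; the grouping step stays the same regardless of which model produces the predictions.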

