CL · May 31, 2023

FEED PETs: Further Experimentation and Expansion on the Disambiguation of Potentially Euphemistic Terms

Patrick Lee, Iyanuoluwa Shode, Alain Chirino Trujillo, Yuan Zhao, Olumide Ebenezer Ojo, Diana Cuevas Plancarte, Anna Feldman, Jing Peng

arXiv:2306.00217v226.3222 citations

Originality: Synthesis-oriented

AI Analysis

This work addresses euphemism disambiguation, a problem of interest to NLP researchers, but its contribution is incremental: it builds on existing transformer-based methods, adding new data and minor extensions.

The study tackled the task of euphemism disambiguation by expanding it to include vagueness annotation and multilingual corpora, finding that transformers perform better on vague terms and establishing preliminary results in Yoruba, Spanish, and Mandarin Chinese.

Transformers have been shown to work well for the task of English euphemism disambiguation, in which a potentially euphemistic term (PET) is classified as euphemistic or non-euphemistic in a particular context. In this study, we expand on the task in two ways. First, we annotate PETs for vagueness, a linguistic property associated with euphemisms, and find that transformers are generally better at classifying vague PETs, suggesting linguistic differences in the data that impact performance. Second, we present novel euphemism corpora in three different languages: Yoruba, Spanish, and Mandarin Chinese. We perform euphemism disambiguation experiments in each language using multilingual transformer models mBERT and XLM-RoBERTa, establishing preliminary results from which to launch future work.
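The task setup described above can be illustrated with a minimal sketch. It shows one plausible way to mark a PET inside its context before classification, and how accuracy might be broken down by vagueness. The `<PET>` marker strings, the `PetExample` fields, the helper names, and the choice of accuracy as the metric are all assumptions made for illustration; none of them is taken from the paper, and the example sentences and labels are invented.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class PetExample:
    """One PET occurrence in context (field names are hypothetical)."""
    text: str    # sentence containing the PET
    start: int   # character offset where the PET begins
    end: int     # character offset just past the PET
    label: int   # 1 = euphemistic, 0 = non-euphemistic
    vague: bool  # illustrative vagueness annotation


def make_example(text, pet, label, vague):
    """Locate the first occurrence of `pet` in `text` and build an example."""
    start = text.index(pet)  # raises ValueError if the PET is absent
    return PetExample(text, start, start + len(pet), label, vague)


def mark_pet(ex, open_tok="<PET>", close_tok="</PET>"):
    """Wrap the PET span in marker strings (assumed format) so that a
    classifier can tell which term in the sentence is being judged."""
    return (ex.text[:ex.start] + open_tok + ex.text[ex.start:ex.end]
            + close_tok + ex.text[ex.end:])


def accuracy_by_vagueness(examples, predictions):
    """Compute accuracy separately for vague and non-vague PETs."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for ex, pred in zip(examples, predictions):
        key = "vague" if ex.vague else "non_vague"
        total[key] += 1
        correct[key] += int(pred == ex.label)
    return {key: correct[key] / total[key] for key in total}
```

In a full pipeline, the string returned by `mark_pet` would be tokenized and passed to a sequence classifier such as mBERT or XLM-RoBERTa; that step is left out here so the sketch stays dependency-free.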
