CL LG | Jan 16, 2023

TEDB System Description to a Shared Task on Euphemism Detection 2022

arXiv:2301.06602

Originality: Synthesis-oriented

AI Analysis

This work addresses euphemism detection, a natural language processing task. Its contribution is incremental because it applies existing methods to a shared task.

The paper treats euphemism detection as a text classification problem and uses Transformer-based models, achieving a best F1-score of 0.816 with a fine-tuned RoBERTa model combined with a KimCNN classifier.

In this report, we describe our Transformers for euphemism detection baseline (TEDB) submissions to the 2022 shared task on euphemism detection. We cast euphemism prediction as a text classification task. We considered Transformer-based models, which are the current state-of-the-art methods for text classification, and explored different training schemes, pretrained models, and model architectures. Our best result, an F1-score of 0.816 (0.818 precision and 0.814 recall), was obtained with a TweetEval/TimeLMs-pretrained RoBERTa model, fine-tuned for euphemism detection, used as a feature-extractor frontend with a KimCNN classifier backend and trained end-to-end using a cosine annealing scheduler. We observed that models pretrained on sentiment analysis and offensiveness detection were associated with higher F1-scores, while pretraining on other tasks, such as sarcasm detection, produced lower F1-scores. Also, adding more word vector channels did not improve performance in our experiments.
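The reported numbers and the scheduler mentioned in the abstract can be illustrated with a minimal sketch. It checks that the stated precision and recall are consistent with the stated F1-score, and it evaluates a single-cycle cosine annealing schedule. The function names, the peak learning rate, the step count, and the choice of a single cycle without restarts are all assumptions made for illustration; the abstract does not specify them.

```python
import math


def f1_from_precision_recall(precision: float, recall: float) -> float:
    """Return the F1-score, the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def cosine_annealing_lr(step: int, total_steps: int,
                        lr_max: float, lr_min: float = 0.0) -> float:
    """Learning rate at `step` on one cosine annealing cycle (no restarts)."""
    # Clamp the step so the rate stays at lr_min after the cycle ends.
    progress = min(max(step, 0), total_steps) / total_steps
    return lr_min + 0.5 * (lr_max - lr_min) * (1 + math.cos(math.pi * progress))


# Precision and recall as reported in the abstract.
print(round(f1_from_precision_recall(0.818, 0.814), 3))  # 0.816

# Hypothetical schedule: peak rate 2e-5 over 1000 steps (not from the paper).
for step in (0, 250, 500, 750, 1000):
    print(step, cosine_annealing_lr(step, total_steps=1000, lr_max=2e-5))
```

The schedule starts at the peak rate, passes through half of it at the midpoint, and decays to the minimum rate at the final step.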
