On Sarcasm Detection with OpenAI GPT-based Models
The paper addresses the challenging problem of sarcasm detection for NLP applications, but its approach is incremental: it applies existing GPT models to a known dataset.
This paper tackles sarcasm detection in natural language by testing various GPT models, both fine-tuned and zero-shot, on the SARC 2.0 dataset. The largest fine-tuned GPT-3 model achieves an accuracy and F1-score of 0.81, outperforming prior models.
Sarcasm is a form of irony that requires readers or listeners to interpret its intended meaning by considering context and social cues. Machine learning classification models have long had difficulty detecting sarcasm due to its social complexity and contradictory nature. This paper explores the application of Generative Pretrained Transformer (GPT) models, including GPT-3, InstructGPT, GPT-3.5, and GPT-4, to detecting sarcasm in natural language. It tests fine-tuned and zero-shot models of different sizes and releases. The GPT models were tested on the political and balanced (pol-bal) portion of the popular Self-Annotated Reddit Corpus (SARC 2.0) sarcasm dataset. In the fine-tuning case, the largest fine-tuned GPT-3 model achieves an accuracy and $F_1$-score of 0.81, outperforming prior models. In the zero-shot case, one of the GPT-4 models yields an accuracy of 0.70 and an $F_1$-score of 0.75. Other models score lower. Additionally, a model's performance may improve or deteriorate with each release, highlighting the need to reassess performance after each release.
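As a reading aid for the two reported metrics, the following is a minimal Python sketch of how accuracy and $F_1$-score can be computed for a binary sarcastic / not-sarcastic task. The function name `accuracy_and_f1`, the label encoding (1 = sarcastic), and the toy labels are assumptions made for illustration; they are not taken from the paper and do not reproduce its evaluation code or data.

```python
def accuracy_and_f1(y_true, y_pred, positive=1):
    """Return (accuracy, F1) for binary labels; `positive` marks the sarcastic class."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and of equal length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return accuracy, f1


# Toy, hypothetical labels on a balanced set (4 sarcastic, 4 not sarcastic).
# The predictor over-predicts the sarcastic class.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 1, 1, 1, 0, 0]
acc, f1 = accuracy_and_f1(y_true, y_pred)
print(f"accuracy={acc:.2f}, F1={f1:.2f}")  # accuracy=0.75, F1=0.80
```

On a balanced set such as pol-bal, an $F_1$-score above accuracy, as in the toy example, is what one would expect when a classifier labels more inputs as sarcastic than there are sarcastic examples, trading precision for recall. Whether that explains the zero-shot GPT-4 figures is not stated in the text above.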