CL LG May 7, 2023

Stanford MLab at SemEval-2023 Task 10: Exploring GloVe- and Transformer-Based Methods for the Explainable Detection of Online Sexism

Hee Jung Choi, Trevor Chow, Aaron Wan, Hong Meng Yam, Swetha Yogeswaran, Beining Zhou

arXiv:2305.04356v1 26.1222 citations

Originality: Synthesis-oriented

AI Analysis

This work addresses the detection of online sexism for content moderation. Its contribution is incremental: it applies existing methods to a specific competition task.

The paper tackles the detection and explanation of online sexism through classification tasks that predict whether a text is sexist and assign sexist texts to subcategories. It explores GloVe embeddings, transformer models (BERT, RoBERTa, DeBERTa), ensembles, and blending. Pre-training yielded significant performance improvements, while ensembles and blending slightly improved robustness in the F1 score.

In this paper, we discuss the methods we applied at SemEval-2023 Task 10: Towards the Explainable Detection of Online Sexism. Given an input text, we perform three classification tasks to predict whether the text is sexist and classify the sexist text into subcategories in order to provide an additional explanation as to why the text is sexist. We explored many different types of models, including GloVe embeddings as the baseline approach, transformer-based deep learning models like BERT, RoBERTa, and DeBERTa, ensemble models, and model blending. We explored various data cleaning and augmentation methods to improve model performance. Pre-training transformer models yielded significant improvements in performance, and ensembles and blending slightly improved robustness in the F1 score.
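
As a rough illustration of the ensembling idea mentioned in the abstract, the sketch below averages per-class probabilities from several classifiers (soft voting) and scores the result with F1. It is not the authors' implementation: the function names `soft_vote` and `macro_f1`, the three sets of model outputs, and the labels are all hypothetical, and macro-averaging is an assumption, since the abstract only says "F1 score".

```python
from typing import List, Sequence


def soft_vote(prob_sets: Sequence[Sequence[Sequence[float]]]) -> List[int]:
    """Average class probabilities across models; return the argmax label per example.

    prob_sets[m][i][c] is model m's probability that example i belongs to class c.
    """
    n_models = len(prob_sets)
    n_examples = len(prob_sets[0])
    preds = []
    for i in range(n_examples):
        n_classes = len(prob_sets[0][i])
        avg = [sum(m[i][c] for m in prob_sets) / n_models for c in range(n_classes)]
        preds.append(max(range(n_classes), key=avg.__getitem__))
    return preds


def macro_f1(y_true: Sequence[int], y_pred: Sequence[int], n_classes: int) -> float:
    """Unweighted mean of per-class F1 scores (macro-averaging is assumed here)."""
    scores = []
    for c in range(n_classes):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / n_classes


# Hypothetical outputs of three binary classifiers (class 0 = not sexist, 1 = sexist).
model_a = [[0.9, 0.1], [0.4, 0.6], [0.2, 0.8], [0.6, 0.4]]
model_b = [[0.8, 0.2], [0.7, 0.3], [0.3, 0.7], [0.3, 0.7]]
model_c = [[0.7, 0.3], [0.3, 0.7], [0.6, 0.4], [0.4, 0.6]]
y_true = [0, 1, 1, 0]  # made-up gold labels for illustration only

y_pred = soft_vote([model_a, model_b, model_c])
print(y_pred, round(macro_f1(y_true, y_pred, n_classes=2), 4))
```

Averaging probabilities rather than hard labels lets a confident model outweigh two uncertain ones, which is one common reason an ensemble can be more stable than any single member. Blending, by contrast, would typically fit a second-stage model on held-out predictions; that step is not shown here.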
