CL · Apr 27, 2021

UoT-UWF-PartAI at SemEval-2021 Task 5: Self Attention Based Bi-GRU with Multi-Embedding Representation for Toxicity Highlighter

arXiv:2104.13164v1711 citations
Originality: Incremental advance
AI Analysis

This work addresses the specific challenge of token-level toxicity detection for applications like content moderation, but it is incremental as it builds on existing methods.

The paper tackles token-level detection of toxic spans in text. It proposes a self-attention-based Bi-GRU model with a multi-embedding representation, which achieved promising results in toxicity detection.

The Toxic Spans Detection (TSD) task is defined as highlighting the spans that make a text toxic. Much work has been done on classifying a given comment or document as toxic or non-toxic. However, none of those models works at the token level. In this paper, we propose a self-attention-based bidirectional gated recurrent unit (BiGRU) with a multi-embedding representation of the tokens. Our proposed model enriches the representation by combining GPT-2, GloVe, and RoBERTa embeddings, which led to promising results. Experimental results show that our proposed approach is very effective in detecting toxic span tokens.
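To make the token-level framing concrete, the sketch below shows how per-token toxic/non-toxic decisions could be turned into highlighted character positions. It is an illustration only, not the authors' pipeline: the whitespace tokenization, the binary label encoding, the character-offset output format, and the function name `tokens_to_char_offsets` are all assumptions introduced here.

```python
import re


def tokens_to_char_offsets(text, token_labels):
    """Map per-token labels (1 = toxic, 0 = non-toxic) to character offsets.

    Hypothetical helper: assumes tokens are whitespace-delimited and that
    exactly one label is supplied per token.
    """
    tokens = list(re.finditer(r"\S+", text))
    if len(tokens) != len(token_labels):
        raise ValueError("expected one label per whitespace-delimited token")

    offsets = []
    for match, label in zip(tokens, token_labels):
        if label == 1:
            # Every character of a toxic token is part of the highlighted span.
            offsets.extend(range(match.start(), match.end()))
    return offsets


text = "you are a stupid person"
labels = [0, 0, 0, 1, 0]  # assumed output of a token-level tagger
offsets = tokens_to_char_offsets(text, labels)
highlighted = "".join(text[i] for i in offsets)
```

Here `highlighted` recovers the single token marked as toxic. A document-level classifier would only produce one label for the whole comment, which is the gap the abstract points to.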

