CL, LG · Jun 26, 2023

The Art of Embedding Fusion: Optimizing Hate Speech Detection

Mohammad Aflah Khan, Neemesh Yadav, Mohit Jain, Sanyam Goyal

arXiv:2306.14939v21.37 citations · h-index: 10 · Has Code

Originality: Synthesis-oriented

AI Analysis

This work addresses how NLP practitioners can optimize hate speech detection, but its contribution is incremental: embedding fusion yields only limited gains.

The paper examines how to effectively combine embeddings from multiple pre-trained language models for hate speech detection. It finds that combining embeddings yields slight improvements at a high computational cost, and that the specific combination method has only a marginal effect on the outcome.

Hate speech detection is a challenging natural language processing task that requires capturing linguistic and contextual nuances. Pre-trained language models (PLMs) offer rich semantic representations of text that can improve this task. However, there is still limited knowledge about how to effectively combine representations across PLMs and leverage their complementary strengths. In this work, we shed light on various combination techniques for several PLMs and comprehensively analyze their effectiveness. Our findings show that combining embeddings leads to slight improvements, but at a high computational cost, and that the choice of combination method has a marginal effect on the final outcome. We also make our codebase public at https://github.com/aflah02/The-Art-of-Embedding-Fusion-Optimizing-Hate-Speech-Detection.
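
To make the idea of combining embeddings concrete, here is a minimal sketch of two generic fusion strategies, concatenation and element-wise averaging. The abstract does not say which techniques the paper evaluates, so these two are assumptions chosen for illustration, and the function names `fuse_concat` and `fuse_mean` and the toy vectors are hypothetical rather than taken from the authors' codebase.

```python
from typing import List, Sequence

Vector = List[float]


def fuse_concat(embeddings: Sequence[Vector]) -> Vector:
    """Concatenate embeddings end to end; dimensions may differ across models."""
    fused: Vector = []
    for emb in embeddings:
        fused.extend(emb)
    return fused


def fuse_mean(embeddings: Sequence[Vector]) -> Vector:
    """Element-wise mean; requires every embedding to have the same dimension."""
    if not embeddings:
        raise ValueError("need at least one embedding")
    dim = len(embeddings[0])
    if any(len(emb) != dim for emb in embeddings):
        raise ValueError("element-wise fusion needs equal dimensions")
    return [sum(vals) / len(embeddings) for vals in zip(*embeddings)]


# Toy vectors standing in for sentence embeddings from two different PLMs
# (illustrative values only, not real model outputs).
emb_a = [1.0, 2.0, 3.0]
emb_b = [3.0, 4.0, 5.0]

concat = fuse_concat([emb_a, emb_b])  # length grows with the number of models
mean = fuse_mean([emb_a, emb_b])      # length stays equal to one model's dimension
```

In either case each input text must be encoded by every model being fused, and with concatenation the downstream classifier also sees a larger input, which is one plausible source of the added computational cost the abstract mentions.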
