The Art of Embedding Fusion: Optimizing Hate Speech Detection
This work aims to help NLP practitioners optimize hate speech detection, but its contribution is incremental: embedding fusion yields only limited gains.
The paper tackles the problem of effectively combining embeddings from multiple pre-trained language models for hate speech detection. It finds that combining embeddings yields slight improvements, but at a high computational cost, and that the specific combination method has only a marginal effect.
Hate speech detection is a challenging natural language processing task that requires capturing linguistic and contextual nuances. Pre-trained language models (PLMs) offer rich semantic representations of text that can improve performance on this task. However, there is still limited knowledge about how to effectively combine representations across PLMs and leverage their complementary strengths. In this work, we shed light on various combination techniques for several PLMs and comprehensively analyze their effectiveness. Our findings show that combining embeddings leads to slight improvements, but at a high computational cost, and that the choice of combination method has only a marginal effect on the final outcome. We also make our codebase public at https://github.com/aflah02/The-Art-of-Embedding-Fusion-Optimizing-Hate-Speech-Detection .
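To make the idea of combining embeddings concrete, the following is a minimal sketch of three common fusion operations applied to sentence vectors from two models. The abstract does not name the specific techniques studied, so concatenation, element-wise sum, and element-wise mean are assumptions chosen for illustration; the function name `fuse_embeddings` and the toy vectors are hypothetical, and plain Python lists stand in for the tensors a real pipeline would use.

```python
def fuse_embeddings(vectors, method="concat"):
    """Fuse one sentence's embeddings from several models into one vector.

    vectors: list of embeddings (lists of floats), one per model.
    method:  "concat", "sum", or "mean" (illustrative choices only,
             not necessarily the techniques evaluated in the paper).
    """
    if method == "concat":
        # Concatenation keeps every dimension from every model.
        return [x for vec in vectors for x in vec]

    # Element-wise methods only make sense for equal-length vectors.
    if len({len(vec) for vec in vectors}) != 1:
        raise ValueError("sum/mean require embeddings of equal dimension")

    if method == "sum":
        return [sum(xs) for xs in zip(*vectors)]
    if method == "mean":
        return [sum(xs) / len(vectors) for xs in zip(*vectors)]
    raise ValueError(f"unknown fusion method: {method}")


# Toy embeddings standing in for the outputs of two different PLMs.
emb_model_a = [1.0, 2.0, 3.0]
emb_model_b = [3.0, 0.0, 1.0]

fused_concat = fuse_embeddings([emb_model_a, emb_model_b], "concat")
fused_sum = fuse_embeddings([emb_model_a, emb_model_b], "sum")
fused_mean = fuse_embeddings([emb_model_a, emb_model_b], "mean")
```

Concatenation grows the input dimension of the downstream classifier with each added model, which is one plausible source of the extra computational cost noted above, while sum and mean keep the dimension fixed but require the models to share an embedding size.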