CL | LG | Jun 26, 2023

The Art of Embedding Fusion: Optimizing Hate Speech Detection

arXiv:2306.14939v2 | 7 citations | h-index: 10 | Has Code
Originality: Synthesis-oriented
AI Analysis

This work targets NLP practitioners who want to optimize hate speech detection, but its contribution is incremental: embedding fusion yields only limited gains.

The paper examines how to effectively combine embeddings from multiple pre-trained language models for hate speech detection. It finds that combining embeddings yields slight improvements at a high computational cost, and that the choice of combination method has only a marginal effect on the outcome.

Hate speech detection is a challenging natural language processing task that requires capturing linguistic and contextual nuances. Pre-trained language models (PLMs) offer rich semantic representations of text that can improve this task. However, there is still limited knowledge about how to effectively combine representations across PLMs and leverage their complementary strengths. In this work, we shed light on various combination techniques for several PLMs and comprehensively analyze their effectiveness. Our findings show that combining embeddings leads to slight improvements but at a high computational cost, and that the choice of combination method has a marginal effect on the final outcome. We also make our codebase public at https://github.com/aflah02/The-Art-of-Embedding-Fusion-Optimizing-Hate-Speech-Detection.
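To make the idea of combining embeddings concrete, here is a minimal sketch of two generic fusion strategies: concatenation and element-wise averaging. These are illustrative assumptions, not necessarily the techniques the paper evaluates. The function names (`fuse_concat`, `fuse_mean`) and the toy vectors are hypothetical stand-ins for sentence embeddings produced by two different PLMs.

```python
from typing import List

Vector = List[float]


def fuse_concat(embeddings: List[Vector]) -> Vector:
    """Concatenate per-model embeddings into one longer vector."""
    fused: Vector = []
    for vec in embeddings:
        fused.extend(vec)
    return fused


def fuse_mean(embeddings: List[Vector]) -> Vector:
    """Element-wise mean; all embeddings must share one dimensionality."""
    dim = len(embeddings[0])
    if any(len(vec) != dim for vec in embeddings):
        raise ValueError("all embeddings must have the same dimension")
    return [sum(values) / len(embeddings) for values in zip(*embeddings)]


# Toy stand-ins (assumed values) for embeddings of the same text
# from two hypothetical PLMs.
emb_model_a = [0.5, -1.0, 2.0]
emb_model_b = [1.5, 1.0, 0.0]

concat = fuse_concat([emb_model_a, emb_model_b])  # dimension 3 + 3 = 6
mean = fuse_mean([emb_model_a, emb_model_b])      # dimension stays 3
```

Either fused vector would then be passed to a downstream classifier. Under either strategy, every PLM involved needs its own forward pass per input, which is one plausible source of the computational cost the abstract mentions.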

Code Implementations: 1 repo
