CL · Jan 30, 2024

Detecting Racist Text in Bengali: An Ensemble Deep Learning Framework

arXiv:2401.16748v1 · 1 citation · h-index: 1 · 2023 26th International Conference on Computer and Information Technology (ICCIT)

Originality: Synthesis-oriented

AI Analysis

This paper addresses the issue of online racism in Bengali social media, though its contribution is incremental: it applies existing methods to a new language-specific dataset.

The paper tackles the problem of detecting racist text in Bengali by building a novel dataset and applying deep learning models, achieving an accuracy of 87.94% using an ensemble approach.

Racism is an alarming phenomenon in our country as well as all over the world. Every day we come across racist comments in both our daily lives and our virtual lives. However, we can eradicate this racism from virtual life (such as social media). In this paper, we try to detect such racist comments with NLP and deep learning techniques. We built a novel dataset in the Bengali language, annotated it, and validated the labels. After extensive use of deep learning methods, we achieved a detection accuracy of 87.94% using the ensemble approach. We applied RNN and LSTM models using BERT embeddings; however, the MCNN-LSTM model performed best among all those models. Lastly, the ensemble approach was used to combine the results of all the models to increase overall performance.
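
The abstract does not say how the ensemble combines the model results. As an illustration only, the sketch below assumes soft voting, in which each model's class probabilities are averaged and the highest-scoring class is chosen. The function name `soft_vote`, the optional `weights` parameter, and all probability values are hypothetical and are not taken from the paper.

```python
import numpy as np


def soft_vote(prob_matrices, weights=None):
    """Average per-model class probabilities and pick the top class.

    prob_matrices: list of arrays, each of shape (n_samples, n_classes).
    weights: optional per-model weights (hypothetical, not from the paper).
    """
    stacked = np.stack(prob_matrices)  # shape: (n_models, n_samples, n_classes)
    avg = np.average(stacked, axis=0, weights=weights)
    return avg.argmax(axis=1), avg


# Made-up probabilities for three hypothetical models on four comments.
# Column 0 = "not racist", column 1 = "racist".
rnn = np.array([[0.9, 0.1], [0.4, 0.6], [0.3, 0.7], [0.6, 0.4]])
lstm = np.array([[0.8, 0.2], [0.6, 0.4], [0.2, 0.8], [0.4, 0.6]])
mcnn_lstm = np.array([[0.7, 0.3], [0.3, 0.7], [0.1, 0.9], [0.3, 0.7]])

labels, avg = soft_vote([rnn, lstm, mcnn_lstm])
print(labels)
```

In the last made-up comment the first model disagrees with the other two, and averaging lets the majority view prevail. Passing `weights` would let a stronger model count for more, which is one possible design choice rather than something the abstract describes.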
