CL AI LG Mar 31, 2022

Bangla hate speech detection on social media using attention-based recurrent neural network

Amit Kumar Das, Abdullah Al Asif, Anik Paul, Md. Nur Hossain

arXiv:2203.16775v1 2.6118 citations

Originality: Incremental advance

AI Analysis

This paper addresses the problem of detecting hate speech in Bengali, a widely used but under-researched language, with incremental improvements in accuracy and interpretability over existing methods.

The paper tackled hate speech detection in Bengali social media comments by proposing an encoder-decoder model with attention mechanisms, achieving a best accuracy of 77% on a dataset of 7,425 comments across seven categories.

Hate speech has spread more rapidly with the everyday use of technology, most notably through people sharing negative opinions or feelings on social media. Although numerous works have addressed hate speech detection in English, German, and other languages, very few have done so for Bengali, even though millions of people communicate on social media in Bengali. The few existing works need improvement in both accuracy and interpretability. This article proposes an encoder-decoder based machine learning model, a popular tool in NLP, to classify users' Bengali comments on Facebook pages. A dataset of 7,425 Bengali comments, covering seven distinct categories of hate speech, was used to train and evaluate the model. 1D convolutional layers were used to extract and encode local features from the comments. Finally, attention-mechanism, LSTM, and GRU based decoders were used to predict hate speech categories. Among the three encoder-decoder algorithms, the attention-based decoder obtained the best accuracy (77%).
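The pipeline described in the abstract (1D convolution for local features, then attention, then a prediction over seven categories) can be illustrated with a minimal, untrained sketch. This is not the authors' implementation: the abstract gives no layer sizes or attention formulation, so the vocabulary size, embedding dimension, filter count, kernel width, the use of simple dot-product attention pooling, and all function names below are assumptions chosen only for illustration. Weights are random, so the output probabilities carry no meaning.

```python
import math
import random

NUM_CLASSES = 7  # seven hate speech categories, per the abstract


def rand_matrix(rows, cols, rng):
    """Small random weight matrix (illustrative initialisation only)."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def conv1d(seq, kernels, width):
    """Valid 1D convolution over time, followed by ReLU.

    seq: list of T embedding vectors; kernels: list of F filters, each a
    flat list of width * dim weights. Returns T - width + 1 vectors of size F.
    """
    out = []
    for t in range(len(seq) - width + 1):
        window = [x for vec in seq[t:t + width] for x in vec]
        out.append([max(0.0, sum(w * x for w, x in zip(k, window)))
                    for k in kernels])
    return out


def attention_pool(features, query):
    """Dot-product attention (an assumed variant): score each time step
    against a query vector, softmax the scores, return the weighted sum."""
    scores = [sum(q * f for q, f in zip(query, feat)) for feat in features]
    weights = softmax(scores)
    dim = len(features[0])
    context = [sum(w * feat[i] for w, feat in zip(weights, features))
               for i in range(dim)]
    return context, weights


def init_params(vocab=50, dim=8, filters=6, width=3, seed=0):
    """Hypothetical hyperparameters; none of these come from the paper."""
    rng = random.Random(seed)
    return {
        "embed": rand_matrix(vocab, dim, rng),
        "kernels": rand_matrix(filters, width * dim, rng),
        "query": [rng.uniform(-0.1, 0.1) for _ in range(filters)],
        "out": rand_matrix(NUM_CLASSES, filters, rng),
        "width": width,
    }


def classify(token_ids, params):
    """Embed tokens, encode with conv1d, pool with attention, classify."""
    seq = [params["embed"][i] for i in token_ids]
    feats = conv1d(seq, params["kernels"], params["width"])
    context, weights = attention_pool(feats, params["query"])
    logits = [sum(w * c for w, c in zip(row, context)) for row in params["out"]]
    return softmax(logits), weights


params = init_params()
# A comment is represented here as six made-up token ids.
probs, attn = classify([3, 17, 8, 42, 5, 9], params)
```

Here `probs` is a distribution over the seven categories and `attn` holds one weight per convolution window. Such per-position weights are the kind of signal an attention-based model can expose for interpretability, since they indicate which parts of a comment most influenced the prediction.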
