CL Apr 29

Comparative Analysis of AutoML and BiLSTM Models for Cyberbullying Detection on Indonesian Instagram Comments

Raihana Adelia Putri, Aisyah Musfirah, Anggi Puspita Ningrum, Luluk Muthoharoh, Ardika Satria, Martin Clinton Tosima Manullang

arXiv:2604.26229 3.3

AI Analysis

This is an incremental application of existing methods to a new language (Indonesian) and domain (Instagram comments) for cyberbullying detection.

The study compares AutoML and BiLSTM models for cyberbullying detection in Indonesian Instagram comments, finding that Logistic Regression performs best among the machine learning models and BiLSTM with Attention performs best among the deep learning models. It also highlights the value of preprocessing tailored to informal Indonesian text.

This study compares machine learning and deep learning approaches for cyberbullying detection in Indonesian-language Instagram comments. Using a balanced dataset of 650 comments labeled as Bullying and Non-Bullying, the study evaluates Naive Bayes, Logistic Regression, and Support Vector Machine with TF-IDF features, as well as BiLSTM and BiLSTM with Bahdanau Attention. A preprocessing pipeline tailored to informal Indonesian text is applied, including slang normalization, stopword removal, and stemming. The results show that Logistic Regression performs best among the machine learning models, while BiLSTM with Attention achieves the strongest overall deep learning performance. The findings highlight the value of domain-specific preprocessing and show that although deep learning captures contextual patterns more effectively, machine learning remains a competitive option for resource-constrained deployments.
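The three preprocessing steps named in the abstract (slang normalization, stopword removal, stemming) can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration only: the `SLANG` dictionary, the `STOPWORDS` set, the `toy_stem` suffix stripper, and the lowercasing and mention/URL cleanup are hypothetical stand-ins, not the authors' resources or pipeline. A real system would likely use a full slang lexicon and a proper Indonesian stemmer.

```python
import re

# Hypothetical slang dictionary: informal spelling -> standard Indonesian.
SLANG = {
    "gk": "tidak",
    "ga": "tidak",
    "bgt": "banget",
    "yg": "yang",
    "jlk": "jelek",
}

# Hypothetical stopword list (a real one would be much longer).
STOPWORDS = {"yang", "dan", "di", "ini", "itu"}

# Toy suffix list standing in for a real Indonesian stemmer.
SUFFIXES = ("nya", "kan", "lah")


def toy_stem(token):
    """Strip one known suffix if at least 4 characters remain."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 4:
            return token[: -len(suffix)]
    return token


def preprocess(comment):
    """Return a list of cleaned tokens for one comment."""
    text = comment.lower()
    # Assumed cleanup step: drop URLs, @mentions, and #hashtags.
    text = re.sub(r"https?://\S+|@\w+|#\w+", " ", text)
    # Keep only ASCII letters and whitespace.
    text = re.sub(r"[^a-z\s]", " ", text)
    tokens = text.split()
    # 1. Slang normalization.
    tokens = [SLANG.get(t, t) for t in tokens]
    # 2. Stopword removal (after normalization, so "yg" -> "yang" is caught).
    tokens = [t for t in tokens if t not in STOPWORDS]
    # 3. Stemming.
    return [toy_stem(t) for t in tokens]


print(preprocess("@user Mukanya jlk bgt yg ini!!"))
```

Normalizing slang before removing stopwords matters in this sketch: an abbreviation such as "yg" only matches the stopword list once it has been expanded to "yang". The resulting token lists could then be joined back into strings and passed to a TF-IDF vectorizer or a tokenizer for a sequence model.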
