Detecting Toxic Language: Ontology and BERT-based Approaches for Bulgarian Text
This work addresses the challenge of detecting toxic language on Bulgarian online platforms while preserving essential information such as medical terms. The contribution is incremental, since it applies existing methods to a new language.
The paper tackles toxic content detection in Bulgarian text by developing an ontology and a BERT-based model trained on a manually annotated dataset of 4,384 sentences, achieving a 0.89 F1 macro score.
Toxic content detection in online communication remains a significant challenge, with current solutions often inadvertently blocking valuable information, including medical terms and text related to minority groups. This paper presents a more nuanced approach to identifying toxicity in Bulgarian text while preserving access to essential information. The research explores two distinct methodologies for detecting toxic content. First, we propose an ontology that models potentially toxic words in the Bulgarian language. Then, we compose a dataset of 4,384 manually annotated sentences from Bulgarian online forums across four categories: toxic language, medical terminology, non-toxic language, and terms related to minority communities. We then train a BERT-based model for toxic language classification, which reaches a macro F1 score of 0.89. The trained model is directly applicable in a real environment and can be integrated as a component of toxic content detection systems. More broadly, the developed methodologies have potential applications across diverse online platforms and content moderation systems.
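To make the reported metric concrete, the following is a minimal sketch of how a macro F1 score could be computed over the four categories. The label strings and the `macro_f1` helper are hypothetical names chosen for illustration; the paper does not specify its evaluation code, and the toy labels below are not taken from its dataset.

```python
from collections import Counter

# Hypothetical label names mirroring the four categories in the abstract.
LABELS = ["toxic", "medical", "non_toxic", "minority"]


def macro_f1(y_true, y_pred, labels=LABELS):
    """Return the unweighted mean of the per-class F1 scores."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative
    scores = []
    for label in labels:
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class never occurs.
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(labels)


# Toy example (made-up labels, not the paper's data).
y_true = ["toxic", "toxic", "medical", "non_toxic", "minority"]
y_pred = ["toxic", "medical", "medical", "non_toxic", "minority"]
score = macro_f1(y_true, y_pred)
```

Because macro averaging weights every class equally, a small category such as medical terminology influences the score as much as a large one, which suits a setting where wrongly blocking rare but valuable content is a concern.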