CL IR Dec 19, 2021

LUC at ComMA-2021 Shared Task: Multilingual Gender Biased and Communal Language Identification without using linguistic features

Rodrigo Cuéllar-Hidalgo, Julio de Jesús Guerrero-Zambrano, Dominic Forest, Gerardo Reyes-Salgado, Juan-Manuel Torres-Moreno

arXiv:2112.10189v1 29.3580 citations

Originality: Synthesis-oriented

AI Analysis

This work addresses the detection of harmful language in social media for content moderation, but it appears incremental as it applies existing methods without new data.

The paper tackles the problem of identifying aggressive, gender-biased, or communally charged language in social network documents, using machine learning algorithms combined with probabilistic and vector space modeling methods. The resulting systems were submitted to the ComMA@ICON'21 competition; the abstract reports no concrete scores.

This work evaluates how well both probabilistic and state-of-the-art vector space modeling (VSM) methods enable well-known machine learning algorithms to identify social network documents as aggressive, gender biased, or communally charged. To this end, an exploratory stage was performed first to find relevant settings to test: using training and development samples, we trained multiple algorithms with multiple vector space modeling and probabilistic methods and discarded the less informative configurations. The resulting systems were submitted to the competition of the ComMA@ICON'21 Workshop on Multilingual Gender Biased and Communal Language Identification.
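
The exploratory stage described in the abstract (train several algorithms on several representations, score them on development data, drop the weaker configurations) can be pictured with the minimal sketch below. It is an illustration under stated assumptions, not the authors' pipeline: the toy sentences, the two feature functions, the two classifiers (a multinomial naive Bayes as the probabilistic method and a cosine nearest-centroid classifier as the vector space method), and every name such as `explore` and `keep` are hypothetical choices made here.

```python
import math
from collections import Counter
from itertools import product

# Toy data invented for illustration only; it is NOT from the ComMA dataset.
TRAIN = [
    ("you are a stupid idiot", 1),
    ("shut up you idiot", 1),
    ("i hate you stupid fool", 1),
    ("have a nice day friend", 0),
    ("thank you for the nice help", 0),
    ("what a lovely day", 0),
]
DEV = [
    ("you stupid fool", 1),
    ("nice day friend", 0),
    ("shut up fool", 1),
    ("thank you friend", 0),
]


def count_features(text):
    """Bag of words with term counts."""
    return Counter(text.split())


def binary_features(text):
    """Bag of words with presence/absence only."""
    return Counter(set(text.split()))


class NaiveBayes:
    """Multinomial naive Bayes with add-one smoothing (probabilistic method)."""

    def fit(self, X, y):
        self.labels = sorted(set(y))
        self.vocab = {w for x in X for w in x}
        self.prior = {c: math.log(y.count(c) / len(y)) for c in self.labels}
        self.counts = {c: Counter() for c in self.labels}
        for x, c in zip(X, y):
            self.counts[c].update(x)
        self.total = {c: sum(self.counts[c].values()) for c in self.labels}
        return self

    def predict(self, x):
        def score(c):
            denom = self.total[c] + len(self.vocab)
            # Words never seen in training are ignored.
            return self.prior[c] + sum(
                n * math.log((self.counts[c][w] + 1) / denom)
                for w, n in x.items() if w in self.vocab
            )
        return max(self.labels, key=score)


class NearestCentroid:
    """Cosine similarity to a per-class summed vector (vector space method)."""

    def fit(self, X, y):
        self.centroids = {}
        for x, c in zip(X, y):
            self.centroids.setdefault(c, Counter()).update(x)
        return self

    def predict(self, x):
        def cosine(a, b):
            dot = sum(v * b[w] for w, v in a.items() if w in b)
            na = math.sqrt(sum(v * v for v in a.values()))
            nb = math.sqrt(sum(v * v for v in b.values()))
            return dot / (na * nb) if na and nb else 0.0
        return max(sorted(self.centroids), key=lambda c: cosine(x, self.centroids[c]))


def evaluate(featurize, model_cls):
    """Train on TRAIN, return accuracy on DEV for one configuration."""
    X = [featurize(t) for t, _ in TRAIN]
    y = [c for _, c in TRAIN]
    model = model_cls().fit(X, y)
    hits = sum(model.predict(featurize(t)) == c for t, c in DEV)
    return hits / len(DEV)


def explore(keep=2):
    """Score every (representation, algorithm) pair and keep the best `keep`."""
    grid = product([count_features, binary_features], [NaiveBayes, NearestCentroid])
    scored = [((f.__name__, m.__name__), evaluate(f, m)) for f, m in grid]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:keep], scored


kept, all_scores = explore()
for config, accuracy in all_scores:
    print(config, accuracy)
```

Development accuracy is used here as the selection criterion only for simplicity; the abstract does not say which measure the authors used to judge a configuration "less informative".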
