CLIR · Dec 19, 2021

LUC at ComMA-2021 Shared Task: Multilingual Gender Biased and Communal Language Identification without using linguistic features

arXiv:2112.10189v1 · 580 citations
Originality: Synthesis-oriented
AI Analysis

This work addresses the detection of harmful language in social media for content moderation, but it appears incremental as it applies existing methods without new data.

The paper addresses the identification of aggressive, gender-biased, or communally charged language in social network documents, pairing machine learning algorithms with probabilistic and vector space modeling methods. The resulting systems were submitted to the ComMA@ICON'21 competition; no concrete scores are given.

This work evaluates how well probabilistic and state-of-the-art vector space modeling (VSM) methods enable well-known machine learning algorithms to classify social network documents as aggressive, gender biased, or communally charged. To this end, an exploratory stage was first performed to find relevant settings to test: using training and development samples, we trained multiple algorithms with multiple vector space modeling and probabilistic methods and discarded the less informative configurations. The resulting systems were submitted to the competition of the ComMA@ICON'21 Workshop on Multilingual Gender Biased and Communal Language Identification.
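The exploratory stage described in the abstract can be pictured as a sweep over (representation, classifier) pairs scored on the development sample. The following is only a minimal sketch of that idea, not the authors' pipeline: the toy data, the two tokenizers (`word_tokens`, `char_trigrams`), the two classifiers (`NaiveBayes`, `CentroidCosine`), the accuracy metric, and the `top_k` cutoff are all assumptions made for illustration.

```python
import math
from collections import Counter
from itertools import product

# Toy data invented for illustration; NOT taken from the shared task.
TRAIN = [("you are all idiots", 1), ("i will hurt you", 1),
         ("have a nice day", 0), ("thanks for the help", 0)]
DEV = [("you idiots", 1), ("nice help thanks", 0)]

def word_tokens(text):
    return text.lower().split()

def char_trigrams(text):
    t = text.lower()
    return [t[i:i + 3] for i in range(len(t) - 2)]

class NaiveBayes:
    """Multinomial Naive Bayes with add-one smoothing (a probabilistic model)."""
    def fit(self, docs, labels):
        self.n = len(labels)
        self.priors = Counter(labels)
        self.counts, self.vocab = {}, set()
        for toks, y in zip(docs, labels):
            self.counts.setdefault(y, Counter()).update(toks)
            self.vocab.update(toks)
        return self

    def predict(self, toks):
        def score(y):
            total = sum(self.counts[y].values()) + len(self.vocab)
            log_prior = math.log(self.priors[y] / self.n)
            return log_prior + sum(
                math.log((self.counts[y][t] + 1) / total) for t in toks)
        return max(self.priors, key=score)

class CentroidCosine:
    """Nearest class centroid by cosine similarity over term counts (a VSM model)."""
    def fit(self, docs, labels):
        self.centroids = {}
        for toks, y in zip(docs, labels):
            self.centroids.setdefault(y, Counter()).update(toks)
        return self

    @staticmethod
    def _cos(a, b):
        dot = sum(a[t] * b[t] for t in a)
        na = math.sqrt(sum(v * v for v in a.values()))
        nb = math.sqrt(sum(v * v for v in b.values()))
        return dot / (na * nb) if na and nb else 0.0

    def predict(self, toks):
        vec = Counter(toks)
        return max(self.centroids, key=lambda y: self._cos(vec, self.centroids[y]))

FEATURES = {"words": word_tokens, "char3": char_trigrams}
CLASSIFIERS = {"nb": NaiveBayes, "centroid": CentroidCosine}

def sweep(train, dev):
    """Train every (representation, classifier) pair; return dev accuracy per pair."""
    results = {}
    for (fname, feat), (cname, cls) in product(FEATURES.items(), CLASSIFIERS.items()):
        model = cls().fit([feat(t) for t, _ in train], [y for _, y in train])
        hits = sum(model.predict(feat(t)) == y for t, y in dev)
        results[(fname, cname)] = hits / len(dev)
    return results

def keep_best(results, top_k=2):
    """Discard the less informative configurations, keeping the top_k by dev score."""
    return sorted(results.items(), key=lambda kv: kv[1], reverse=True)[:top_k]

results = sweep(TRAIN, DEV)
for config, accuracy in keep_best(results):
    print(config, accuracy)
```

A real run would use the task's training and development splits and its official metric in place of the toy data and plain accuracy used here.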

