CL CY Mar 12, 2024

MoralBERT: A Fine-Tuned Language Model for Capturing Moral Values in Social Discussions

Vjosa Preniqi, Iacopo Ghinassi, Julia Ive, Charalampos Saitis, Kyriaki Kalimeri

arXiv:2403.07678v2 · 10.423 citations · h-index: 21 · Has Code · GoodIT

Originality: Incremental advance

AI Analysis

This work addresses the need for annotation-free morality learning in NLP to understand moral narratives in controversial social debates, though it is incremental as it builds on existing models and theories.

The paper tackles the problem of capturing moral values in social discussions by introducing MoralBERT, a range of language models fine-tuned on human-annotated social media data and grounded in Moral Foundations Theory. For in-domain inference, it achieves an average F1 score between 11% and 32% higher than baselines: lexicon-based approaches, Word2Vec embeddings, and zero-shot classification with GPT-4.

Moral values play a fundamental role in how we evaluate information, make decisions, and form judgements around important social issues. Controversial topics, including vaccination, abortion, racism, and sexual orientation, often elicit opinions and attitudes that are not solely based on evidence but rather reflect moral worldviews. Recent advances in Natural Language Processing (NLP) show that moral values can be gauged in human-generated textual content. Building on the Moral Foundations Theory (MFT), this paper introduces MoralBERT, a range of language representation models fine-tuned to capture moral sentiment in social discourse. We describe a framework for both aggregated and domain-adversarial training on multiple heterogeneous MFT human-annotated datasets sourced from Twitter (now X), Reddit, and Facebook that broaden textual content diversity in terms of social media audience interests, content presentation and style, and spreading patterns. We show that the proposed framework achieves an average F1 score that is between 11% and 32% higher than lexicon-based approaches, Word2Vec embeddings, and zero-shot classification with large language models such as GPT-4 for in-domain inference. Domain-adversarial training yields better out-of-domain predictions than aggregate training while achieving comparable performance to zero-shot learning. Our approach contributes to annotation-free and effective morality learning, and provides useful insights towards a more comprehensive understanding of moral narratives in controversial social debates using NLP.
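To make the headline metric concrete, the sketch below shows one common way an "average F1 score" can be computed when each moral foundation is treated as its own binary label: compute F1 per label, then take the unweighted mean. This is an illustration only. The abstract does not state the exact averaging scheme, and the label names (`care`, `fairness`) and toy predictions are assumptions made up for the example, not data from the paper.

```python
from typing import Dict, List


def binary_f1(y_true: List[int], y_pred: List[int]) -> float:
    """F1 for a single binary label: harmonic mean of precision and recall."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        # No true positives: precision and recall are 0 or undefined.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def average_f1(gold: Dict[str, List[int]], pred: Dict[str, List[int]]) -> float:
    """Unweighted mean of per-label F1 (assumed averaging scheme)."""
    scores = [binary_f1(gold[label], pred[label]) for label in gold]
    return sum(scores) / len(scores)


# Hypothetical toy data: four posts, two illustrative labels.
gold = {"care": [1, 0, 1, 1], "fairness": [0, 0, 1, 0]}
pred = {"care": [1, 0, 0, 1], "fairness": [0, 1, 1, 0]}

for label in gold:
    print(label, round(binary_f1(gold[label], pred[label]), 3))
print("average", round(average_f1(gold, pred), 3))
```

Averaging per-label scores without weighting gives every label equal influence regardless of how often it occurs, which matters when some moral categories are rare in the annotated data.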
