CL · Nov 10, 2025

LLMs vs. Traditional Sentiment Tools in Psychology: An Evaluation on Belgian-Dutch Narratives

arXiv:2511.07641v14.92 citations · h-index: 1

Originality: Synthesis-oriented

AI Analysis

The study's results challenge the assumption that LLMs outperform traditional tools at sentiment analysis for low-resource language variants such as Flemish, and they highlight the need for culturally tailored evaluation frameworks.

The study compared three Dutch-specific Large Language Models with the traditional lexicon-based tools LIWC and Pattern at predicting emotional valence in Flemish narratives. The LLMs underperformed the traditional methods, with Pattern showing superior performance.

Understanding emotional nuances in everyday language is crucial for computational linguistics and emotion research. While traditional lexicon-based tools like LIWC and Pattern have served as foundational instruments, Large Language Models (LLMs) promise enhanced context understanding. We evaluated three Dutch-specific LLMs (ChocoLlama-8B-Instruct, Reynaerde-7B-chat, and GEITje-7B-ultra) against LIWC and Pattern for valence prediction in Flemish, a low-resource language variant. Our dataset comprised approximately 25000 spontaneous textual responses from 102 Dutch-speaking participants, each providing narratives about their current experiences with self-assessed valence ratings (-50 to +50). Surprisingly, despite architectural advancements, the Dutch-tuned LLMs underperformed compared to traditional methods, with Pattern showing superior performance. These findings challenge assumptions about LLM superiority in sentiment analysis tasks and highlight the complexity of capturing emotional valence in spontaneous, real-world narratives. Our results underscore the need for developing culturally and linguistically tailored evaluation frameworks for low-resource language variants, while questioning whether current LLM fine-tuning approaches adequately address the nuanced emotional expressions found in everyday language use.
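The evaluation setup described in the abstract (a tool's predicted valence compared with each participant's self-assessed rating on the -50 to +50 scale) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the toy lexicon, the example sentences and ratings, the rescaling of polarity to the rating range, and the use of Pearson correlation as the agreement measure. None of it is taken from the paper, from Pattern, or from LIWC.

```python
from math import sqrt

# Hypothetical toy lexicon (word -> polarity in [-1, 1]).
# These entries are invented for illustration; they are NOT from Pattern or LIWC.
TOY_LEXICON = {
    "blij": 0.8,
    "goed": 0.6,
    "leuk": 0.7,
    "moe": -0.4,
    "slecht": -0.7,
    "verdrietig": -0.8,
}


def lexicon_valence(text, lexicon=TOY_LEXICON):
    """Average the polarity of known words and rescale to the -50..+50 range.

    Returns 0.0 (neutral) when no word of the text is in the lexicon.
    """
    tokens = [tok.strip(".,!?;:").lower() for tok in text.split()]
    scores = [lexicon[tok] for tok in tokens if tok in lexicon]
    if not scores:
        return 0.0
    return 50.0 * sum(scores) / len(scores)


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Invented (narrative, self-assessed valence) pairs, not data from the study.
samples = [
    ("Ik ben blij, het was een leuk feest.", 35),
    ("Ik voel me moe en verdrietig.", -30),
    ("Het eten was goed.", 20),
    ("Slecht geslapen vannacht.", -15),
]

predicted = [lexicon_valence(text) for text, _ in samples]
reported = [rating for _, rating in samples]
r = pearson(predicted, reported)
print(f"Pearson r between predicted and self-reported valence: {r:.3f}")
```

In a setup like this, an LLM-based predictor would simply replace `lexicon_valence` with a function that returns a number on the same scale, so that both kinds of tool are scored against the same self-reports.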
