CL AI May 22, 2025

University of Indonesia at SemEval-2025 Task 11: Evaluating State-of-the-Art Encoders for Multi-Label Emotion Detection

arXiv:2505.16460v1 1 citations h-index: 12
Originality: Synthesis-oriented
AI Analysis

This work addresses emotion classification for multilingual NLP applications, but it is incremental as it applies existing methods to a new benchmark task.

The paper tackled multi-label emotion detection across 28 languages by comparing fine-tuning and classifier-only training strategies, finding that prompt-based encoders with CatBoost classifiers outperformed fully fine-tuned models, achieving an average F1-macro score of 56.58.

This paper presents our approach for SemEval 2025 Task 11 Track A, focusing on multilabel emotion classification across 28 languages. We explore two main strategies: fully fine-tuning transformer models and classifier-only training, evaluating different settings such as fine-tuning strategies, model architectures, loss functions, encoders, and classifiers. Our findings suggest that training a classifier on top of prompt-based encoders such as mE5 and BGE yields significantly better results than fully fine-tuning XLMR and mBERT. Our best-performing model on the final leaderboard is an ensemble combining multiple BGE models, where CatBoost serves as the classifier, with different configurations. This ensemble achieves an average F1-macro score of 56.58 across all languages.
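The reported metric can be illustrated with a small sketch. The code below computes macro-averaged F1 for multi-label predictions: one F1 score per emotion label, then an unweighted mean over labels. The function names `f1_for_label` and `macro_f1` are hypothetical, and scoring a label with no positives in either the gold or predicted column as 0.0 is an assumption, not something the abstract specifies. The abstract's 56.58 is presumably this quantity on a 0-100 scale, averaged over languages.

```python
from typing import Sequence


def f1_for_label(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """F1 for a single binary label column (1 = emotion present)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    # Assumption: a label with no true or predicted positives scores 0.0.
    return 2 * tp / denom if denom else 0.0


def macro_f1(y_true: Sequence[Sequence[int]],
             y_pred: Sequence[Sequence[int]]) -> float:
    """Unweighted mean of per-label F1 over all label columns."""
    n_labels = len(y_true[0])
    per_label = [
        f1_for_label([row[j] for row in y_true], [row[j] for row in y_pred])
        for j in range(n_labels)
    ]
    return sum(per_label) / n_labels


if __name__ == "__main__":
    # Toy data: 4 texts, 3 hypothetical emotion labels.
    gold = [[1, 0, 1], [0, 1, 0], [1, 1, 0], [0, 0, 1]]
    pred = [[1, 0, 0], [0, 1, 0], [1, 0, 0], [0, 1, 1]]
    print(round(macro_f1(gold, pred), 4))
```

Because every label contributes equally to the mean, rare emotions weigh as much as frequent ones, so a model that ignores an infrequent label is penalized more than it would be under micro-averaging.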

