CV · AI · CL · Apr 11

Demographic and Linguistic Bias Evaluation in Omnimodal Language Models

arXiv:2604.10014 · 56.9 · h-index: 1
Predicted impact top 58% in CV · last 90 days · Originality: Synthesis-oriented
AI Analysis

For developers and deployers of multimodal AI systems, this study highlights the need to evaluate fairness across all modalities, as audio tasks exhibit severe biases that could harm underrepresented groups.

This paper evaluates demographic and linguistic biases in four omnimodal language models across tasks like demographic estimation and speech transcription, finding that audio tasks show significantly lower performance and larger biases (e.g., accuracy differences across age, gender, and language) compared to image/video tasks.

This paper provides a comprehensive evaluation of demographic and linguistic biases in omnimodal language models that process text, images, audio, and video within a single framework. Although these models are being widely deployed, their performance across different demographic groups and modalities is not well studied. Four omnimodal models are evaluated on tasks that include demographic attribute estimation, identity verification, activity recognition, multilingual speech transcription, and language identification. Accuracy differences are measured across age, gender, skin tone, language, and country of origin. The results show that image and video understanding tasks generally exhibit better performance with smaller demographic disparities. In contrast, audio understanding tasks exhibit significantly lower performance and substantial bias, including large accuracy differences across age groups, genders, and languages, and frequent prediction collapse toward narrow categories. These findings highlight the importance of evaluating fairness across all supported modalities as omnimodal language models are increasingly used in real-world applications.
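The abstract does not state how the accuracy differences or the prediction collapse are quantified. As a minimal sketch, assuming the disparity is the gap between the best- and worst-performing group and that collapse can be flagged by the share of the single most frequent prediction, the two quantities might be computed as follows. The function names and the toy records are hypothetical and are not taken from the paper.

```python
from collections import Counter, defaultdict


def group_accuracies(records):
    """Per-group accuracy from (group, label, prediction) triples."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for group, label, prediction in records:
        total[group] += 1
        correct[group] += int(label == prediction)
    return {group: correct[group] / total[group] for group in total}


def accuracy_gap(records):
    """Assumed disparity measure: best-group minus worst-group accuracy."""
    accuracies = group_accuracies(records)
    return max(accuracies.values()) - min(accuracies.values())


def top_prediction_share(records):
    """Fraction of all predictions that fall in the most frequent class.

    A value near 1.0 suggests the model is collapsing toward one category.
    """
    counts = Counter(prediction for _, _, prediction in records)
    return counts.most_common(1)[0][1] / sum(counts.values())


# Toy data for illustration only: (age group, true label, predicted label).
toy_records = [
    ("young", "a", "a"),
    ("young", "b", "b"),
    ("old", "a", "a"),
    ("old", "b", "a"),
]

print(group_accuracies(toy_records))
print(accuracy_gap(toy_records))
print(top_prediction_share(toy_records))
```

A max-min gap is only one possible choice; pairwise differences or a ratio between groups would fit the same per-group accuracies.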
