AS · SD · May 2

Toward Fair Speech Technologies: A Comprehensive Survey of Bias and Fairness in Speech AI

arXiv:2605.01597 · 91.2 · h-index: 2
Predicted impact: top 7% in AS · last 90 days
Originality: Synthesis-oriented
AI Analysis

For researchers and practitioners in speech AI, this survey provides a comprehensive, structured overview of fairness issues, but it is a literature review without empirical contributions.

This survey synthesizes over 400 studies on bias and fairness in speech AI, proposing a unified framework linking formal fairness definitions to evaluation, diagnosis, and mitigation. It identifies speech-specific bias sources and organizes the field into three paradigms: Robustness, Representation, and Governance.

Speech technologies are deployed in high-stakes settings, yet fairness concerns remain fragmented across tasks and disciplines. Existing surveys either adopt a general machine-learning perspective that overlooks speech-specific properties or focus on a single task, missing failure patterns shared across the speech domain. Synthesizing over 400 studies spanning generation and perception tasks and emerging speech-language models, this survey presents a unified framework that links formal fairness definitions to evaluation, diagnosis, and mitigation. We formalize seven fairness definitions adapted to the speech modality and organize the field's conceptual evolution through three paradigms: Robustness, Representation, and Governance. We then ground evaluation metrics in the mathematical cores of these definitions and offer a decision tree for metric selection. We diagnose bias sources along the speech processing pipeline, surfacing speech-specific mechanisms such as channel bias as a demographic proxy and annotation subjectivity in emotion labels. We systematize mitigation strategies across four intervention stages, mapping each to the diagnosed sources. Finally, we identify open challenges and propose directions for future research.

