On the social bias of speech self-supervised models
The paper addresses bias in SSL speech models, which can perpetuate injustice by affecting marginalized groups, but its contribution is incremental because it builds on existing debiasing methods.
The paper investigates social bias in self-supervised learning (SSL) speech models, revealing that prevalent models acquire biased associations, and finds that techniques like row-pruning and training wider, shallower models can effectively mitigate this bias.
Self-supervised learning (SSL) speech models have achieved remarkable performance in various tasks, yet their biased outcomes, especially those affecting marginalized groups, raise significant concerns. Social bias refers to the phenomenon where algorithms potentially amplify disparities between social groups that are present in the training data. Bias in SSL models can perpetuate injustice by automating discriminatory patterns and reinforcing inequitable systems. This work reveals that prevalent SSL models inadvertently acquire biased associations. We probe how various factors, such as model architecture, size, and training methodology, influence the propagation of social bias within these models. Finally, we explore the efficacy of debiasing SSL models through regularization techniques, specifically via model compression. Our findings reveal that techniques such as row-pruning and training wider, shallower models can effectively mitigate social bias within SSL models.
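To make the term row-pruning concrete, the sketch below zeroes out whole rows of a weight matrix. It is an illustration only, not the paper's procedure: the abstract does not say how rows are selected, so ranking rows by L2 norm is an assumption, and the names `row_prune`, `keep_ratio`, and the toy matrix `W` are hypothetical.

```python
import math


def row_prune(weight, keep_ratio):
    """Zero out the rows of `weight` that have the smallest L2 norms.

    weight: list of rows, each a list of floats.
    keep_ratio: fraction of rows to keep, in (0, 1].
    Returns (pruned_weight, kept_row_indices).
    Assumption: rows are ranked by L2 norm; the source does not give a criterion.
    """
    if not 0.0 < keep_ratio <= 1.0:
        raise ValueError("keep_ratio must be in (0, 1]")
    n_keep = max(1, round(keep_ratio * len(weight)))
    norms = [math.sqrt(sum(x * x for x in row)) for row in weight]
    # Pick the n_keep rows with the largest norms, then restore original order.
    ranked = sorted(range(len(weight)), key=lambda i: norms[i], reverse=True)
    kept = sorted(ranked[:n_keep])
    kept_set = set(kept)
    pruned = [
        list(row) if i in kept_set else [0.0] * len(row)
        for i, row in enumerate(weight)
    ]
    return pruned, kept


# Toy 4x2 weight matrix (hypothetical values), keeping half of the rows.
W = [[0.1, -0.1], [2.0, 1.0], [0.0, 0.05], [-1.5, 0.5]]
pruned_W, kept_rows = row_prune(W, keep_ratio=0.5)
print(kept_rows)
print(pruned_W)
```

In a real SSL speech model, such a mask would be applied per layer to the weight matrices and typically followed by fine-tuning; those steps are outside this sketch.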