AS AI LG | Sep 25, 2025

Enhanced Generative Machine Listener

arXiv:2509.21463v1
h-index: 9
Originality: Incremental advance
AI Analysis

GMLv2 provides an automated framework for perceptual audio quality evaluation, with the potential to accelerate research in audio coding technologies.

The researchers tackled the problem of predicting subjective audio quality (MUSHRA scores) by developing GMLv2, a reference-based model that outperformed existing metrics like PEAQ and ViSQOL in correlation and reliability across diverse content and codecs.
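As a rough illustration of what "correlation with subjective scores" means in practice, the sketch below computes Pearson (linear) and Spearman (rank) correlation between predicted and listener-rated scores. The function names and the sample numbers are hypothetical and are not taken from the paper; the paper's exact evaluation protocol is not described here.

```python
import math


def pearson(xs, ys):
    """Pearson linear correlation between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))


# Hypothetical MUSHRA scores (0-100) for five test items, for illustration only.
subjective = [20, 40, 60, 80, 95]
predicted = [25, 38, 65, 78, 90]
print(pearson(subjective, predicted), spearman(subjective, predicted))
```

Pearson rewards predictions that track the subjective scores linearly, while Spearman only checks that items are ranked in the same order, so reporting both gives a fuller picture of a metric's behaviour.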

We present GMLv2, a reference-based model designed for the prediction of subjective audio quality as measured by MUSHRA scores. GMLv2 introduces a Beta distribution-based loss to model the listener ratings and incorporates additional neural audio coding (NAC) subjective datasets to extend its generalization and applicability. Extensive evaluations on a diverse test set demonstrate that the proposed GMLv2 consistently outperforms widely used metrics, such as PEAQ and ViSQOL, both in terms of correlation with subjective scores and in reliably predicting these scores across diverse content types and codec configurations. Consequently, GMLv2 offers a scalable and automated framework for perceptual audio quality evaluation, poised to accelerate research and development in modern audio coding technologies.
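The abstract does not spell out the form of the Beta distribution-based loss. One common reading, offered here only as a minimal sketch under stated assumptions, is a negative log-likelihood: the model outputs two positive parameters, alpha and beta, and each listener rating is rescaled from the 0-100 MUSHRA range to the open interval (0, 1). The function names, the clipping constant `eps`, and the rescaling are assumptions for illustration, not the authors' implementation.

```python
import math


def beta_nll(score, alpha, beta, eps=1e-4):
    """Negative log-likelihood of one rating (0-100) under Beta(alpha, beta).

    Assumption: ratings are divided by 100 and clipped to [eps, 1 - eps],
    because the Beta log-density can diverge at exactly 0 or 1.
    """
    if alpha <= 0 or beta <= 0:
        raise ValueError("alpha and beta must be positive")
    x = min(max(score / 100.0, eps), 1.0 - eps)
    # Log of the Beta function B(alpha, beta), via log-gamma for stability.
    log_norm = math.lgamma(alpha) + math.lgamma(beta) - math.lgamma(alpha + beta)
    log_pdf = (alpha - 1.0) * math.log(x) + (beta - 1.0) * math.log(1.0 - x) - log_norm
    return -log_pdf


def beta_mean_score(alpha, beta):
    """Mean of Beta(alpha, beta), mapped back to the 0-100 score range."""
    return 100.0 * alpha / (alpha + beta)


# A distribution concentrated near 80 penalizes a rating of 20 far more than 80.
print(beta_nll(80, 8.0, 2.0), beta_nll(20, 8.0, 2.0), beta_mean_score(8.0, 2.0))
```

A loss of this kind suits bounded ratings because the Beta distribution lives on a finite interval and can be skewed toward either end of the scale, unlike a Gaussian assumption implied by plain mean squared error.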

