An Improved Measure of Musical Noise Based on Spectral Kurtosis
This work addresses the accurate quantification of musical noise, a problem relevant to researchers and practitioners in audio coding, speech enhancement, and source separation. It represents an incremental improvement over existing methods.
The paper tackles the problem of measuring musical noise artifacts in audio processing by proposing an improved measure based on spectral kurtosis. The proposed measure outperforms the baseline measures in correlation with the results of two listening tests. On the audio coding listening test, it correlates nearly as well as the Artifact-related Perceptual Score (APS) while requiring only a fraction of the computational cost.
Audio processing methods operating on a time-frequency representation of the signal can introduce unpleasant-sounding artifacts known as musical noise. These artifacts are observed in the context of audio coding, speech enhancement, and source separation. The change in kurtosis of the power spectrum introduced during the processing was shown to correlate with the human perception of musical noise in the context of speech enhancement, leading to the proposal of measures based on it. These baseline measures are here shown to correlate with human perception only in a limited manner. As ground truth for the human perception, the results from two listening tests are considered: one involving audio coding and one involving source separation. Simple but effective perceptually motivated improvements are proposed, and the resulting new measure is shown to clearly outperform the baselines in terms of correlation with the results of both listening tests. Moreover, with respect to the listening test on musical noise in audio coding, the exhibited correlation is nearly as good as that of the Artifact-related Perceptual Score (APS), which was found to be the best objective measure for this task. The APS is, however, computationally very expensive. The proposed measure is easily computed, requiring only a fraction of the computational cost of the APS.
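To make the baseline idea concrete, the following is a minimal sketch of a kurtosis-change measure: the kurtosis of the power spectrogram is computed before and after processing, and the logarithm of their ratio is reported. It illustrates the general baseline principle only and is not the improved measure proposed in the paper. The function names, the Hann window, the frame length and hop size, and the pooling of all time-frequency bins into a single kurtosis value are assumptions made for this illustration.

```python
import numpy as np

def power_spectrogram(x, frame_len=512, hop=256):
    """Power spectrogram |STFT|^2, shape (frames, bins).

    Assumed analysis settings: Hann window, 512-sample frames, 50% overlap.
    """
    x = np.asarray(x, dtype=float)
    window = np.hanning(frame_len)
    n_frames = 1 + (len(x) - frame_len) // hop
    frames = np.stack(
        [x[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )
    return np.abs(np.fft.rfft(frames, axis=1)) ** 2

def kurtosis(p):
    """Non-excess kurtosis (fourth standardized moment) of all values in p."""
    p = np.asarray(p, dtype=float).ravel()
    mu = p.mean()
    m2 = np.mean((p - mu) ** 2)
    m4 = np.mean((p - mu) ** 4)
    return m4 / m2 ** 2

def log_kurtosis_ratio(reference, processed, **stft_kwargs):
    """Log ratio of power-spectrum kurtosis after vs. before processing.

    Hypothetical helper: pooling over all time-frequency bins is an
    assumption of this sketch, not a detail taken from the paper.
    """
    k_ref = kurtosis(power_spectrogram(reference, **stft_kwargs))
    k_proc = kurtosis(power_spectrogram(processed, **stft_kwargs))
    return np.log(k_proc / k_ref)
```

Under these assumptions, a value of zero means the processing left the kurtosis of the power spectrum unchanged, and a positive value means the kurtosis increased. Because kurtosis is scale invariant, a pure gain change yields a value of zero.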