Referenceless Performance Evaluation of Audio Source Separation using Deep Neural Networks
This work addresses a limitation of existing evaluation methods for audio source separation: they cannot be applied in real-world scenarios where reference signals are unavailable.
The paper tackles the problem of evaluating audio source separation without reference signals. It proposes a deep neural network that maps processed audio to a quality score, and shows that the network can predict the sources-to-artifacts ratio (SAR) reported by a standard toolkit without access to ground truth.
Current performance evaluation for audio source separation depends on comparing the processed or separated signals with reference signals. Common performance evaluation toolkits are therefore not applicable to real-world situations where the ground-truth audio is unavailable. In this paper, we propose a performance evaluation technique that assesses separation quality without requiring reference signals. The proposed technique uses a deep neural network (DNN) to map the processed audio to its quality score. Our experimental results show that the DNN is capable of predicting the sources-to-artifacts ratio (SAR) from the blind source separation evaluation (BSS Eval) toolkit without the need for reference signals.
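The mapping from processed audio to a quality score can be illustrated with a minimal sketch. The abstract does not specify the input features, network architecture, or training procedure, so everything below is an assumption made for illustration only: the mean log-magnitude spectrum as input, a single hidden layer, untrained random weights, and the hypothetical names `log_spectral_features`, `QualityRegressor`, and `mse_loss`. In practice such a network would presumably be trained by regression against sources-to-artifacts ratio values computed with reference signals on a training set, after which no reference is needed at inference time.

```python
import numpy as np


def log_spectral_features(audio, frame_len=512, hop=256):
    """Mean log-magnitude spectrum over frames (an assumed feature choice)."""
    n_frames = 1 + (len(audio) - frame_len) // hop
    window = np.hanning(frame_len)
    frames = np.stack([audio[i * hop: i * hop + frame_len] * window
                       for i in range(n_frames)])
    magnitude = np.abs(np.fft.rfft(frames, axis=1))
    # Average over time to get one fixed-size vector per signal.
    return np.log(magnitude + 1e-8).mean(axis=0)


class QualityRegressor:
    """Tiny one-hidden-layer MLP mapping features to one scalar score.

    Hypothetical stand-in for the paper's DNN; weights here are random
    and untrained, so the output is not a meaningful score.
    """

    def __init__(self, n_in, n_hidden=64, seed=0):
        rng = np.random.default_rng(seed)
        self.w1 = rng.normal(0.0, np.sqrt(2.0 / n_in), (n_in, n_hidden))
        self.b1 = np.zeros(n_hidden)
        self.w2 = rng.normal(0.0, np.sqrt(1.0 / n_hidden), (n_hidden, 1))
        self.b2 = np.zeros(1)

    def predict(self, features):
        hidden = np.maximum(0.0, features @ self.w1 + self.b1)  # ReLU
        return (hidden @ self.w2 + self.b2).squeeze(-1)


def mse_loss(predicted, target):
    """Regression loss one might minimize against toolkit-computed scores."""
    return float(np.mean((predicted - target) ** 2))


# Synthetic noise stands in for one second of separated audio at 16 kHz.
rng = np.random.default_rng(1)
separated = rng.standard_normal(16000)

features = log_spectral_features(separated)
model = QualityRegressor(n_in=features.shape[0])
predicted_score = model.predict(features[None, :])  # batch of one signal
```

The point of the sketch is the interface: only the separated signal enters `predict`, whereas a reference-based toolkit would also need the ground-truth sources.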