A report on sound event detection with different binaural features
The work is an incremental improvement for audio processing applications and could improve detection accuracy in noisy environments.
The paper tackles sound event detection by comparing binaural audio features with single-channel features. It finds that the binaural features perform as well as or better than the single-channel features in terms of error rate on the TUT Sound Events 2017 dataset.
In this paper, we compare the performance of binaural audio features used in place of single-channel features for sound event detection. Three different binaural features are studied and evaluated on the publicly available, 70-minute TUT Sound Events 2017 dataset. Sound event detection is performed separately with single-channel and binaural features using a stacked convolutional and recurrent neural network, and the evaluation is reported using the standard metrics of error rate and F-score. The studied binaural features are seen to consistently perform equal to or better than the single-channel features with respect to the error rate metric.
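
To make the two reported metrics concrete, the following is a minimal sketch of how error rate and F-score are commonly computed for sound event detection. It assumes the segment-based definitions that are widely used in this field (substitutions, deletions, and insertions counted per segment); the abstract does not state the segment length or the exact variant the authors used, so this is an illustration rather than their evaluation code. The function name `segment_metrics` and the toy activity matrices are hypothetical.

```python
def segment_metrics(reference, estimate):
    """Segment-based error rate and F-score (assumed definitions).

    reference, estimate: lists of segments; each segment is a list of
    0/1 flags, one per event class (1 = class active in that segment).
    """
    tp = fp = fn = 0
    subs = dels = ins = 0
    n_ref = 0  # total number of active reference events
    for ref_seg, est_seg in zip(reference, estimate):
        seg_tp = sum(1 for r, e in zip(ref_seg, est_seg) if r and e)
        seg_fp = sum(1 for r, e in zip(ref_seg, est_seg) if e and not r)
        seg_fn = sum(1 for r, e in zip(ref_seg, est_seg) if r and not e)
        tp += seg_tp
        fp += seg_fp
        fn += seg_fn
        # A false positive paired with a false negative in the same
        # segment counts as one substitution; the remainder counts as
        # deletions or insertions.
        subs += min(seg_fp, seg_fn)
        dels += max(0, seg_fn - seg_fp)
        ins += max(0, seg_fp - seg_fn)
        n_ref += sum(1 for r in ref_seg if r)
    error_rate = (subs + dels + ins) / n_ref if n_ref else 0.0
    denom = 2 * tp + fp + fn
    f_score = 2 * tp / denom if denom else 0.0
    return error_rate, f_score


# Toy example: 3 segments, 3 event classes (made-up data).
reference = [[1, 0, 0], [1, 1, 0], [0, 0, 1]]
estimate = [[1, 0, 0], [1, 0, 1], [0, 0, 0]]
er, f = segment_metrics(reference, estimate)
print(f"ER = {er:.2f}, F = {f:.3f}")
```

Under these definitions a lower error rate is better and a higher F-score is better. Error rate can exceed 1.0 when a system produces many insertions, so the two metrics do not always rank systems identically. This may be relevant to the abstract's claim, which is stated only with respect to error rate.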