Urban Sound Classification : striving towards a fair comparison
This paper addresses the problem of fair comparison and reproducibility in urban sound classification, offering researchers an incremental framework to standardize evaluations.
The paper tackles urban sound classification for noise pollution monitoring, presenting the winning solution to DCASE 2020 task 5, which achieves a macro-AUPRC of 0.82 / 0.62 for coarse / fine classification on the validation set and accuracies of 89.7% and 85.41% on the ESC-50 and US8k datasets, respectively.
Urban sound classification has made remarkable progress and remains an active research area in audio pattern recognition. In particular, it makes it possible to monitor noise pollution, a growing concern for large cities. The contribution of this paper is two-fold. First, we present our winning solution to DCASE 2020 task 5, which aims to help monitor urban noise pollution. It achieves a macro-AUPRC of 0.82 / 0.62 for coarse / fine classification on the validation set. Moreover, it reaches accuracies of 89.7% and 85.41% on the ESC-50 and US8k datasets, respectively. Second, it is difficult to find fair comparisons of existing models and to reproduce their reported performance. Authors sometimes copy results directly from the original papers, which does not help reproducibility. We therefore provide a fair comparison by using the same input representation, metrics, and optimizer to assess performance, while preserving the data augmentation used in the original papers. We hope this framework will help evaluate new architectures in this field. For better reproducibility, the code is available on our GitHub repository.
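For readers unfamiliar with the macro-AUPRC metric quoted above, the following is a minimal sketch of one common way to compute it: the area under the precision-recall curve is estimated per class as average precision, then averaged with equal weight over classes. The function names `average_precision` and `macro_auprc` and the toy labels and scores are illustrative assumptions, not taken from the paper or its repository, and the official DCASE evaluation code may differ in details such as curve interpolation and tie handling.

```python
from typing import List, Sequence


def average_precision(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """Step-wise area under the precision-recall curve for a single class."""
    n_pos = sum(y_true)
    if n_pos == 0:
        raise ValueError("class has no positive example")
    # Rank clips from highest to lowest predicted score.
    order = sorted(range(len(y_score)), key=lambda i: y_score[i], reverse=True)
    hits = 0
    total = 0.0
    for rank, i in enumerate(order, start=1):
        if y_true[i] == 1:
            hits += 1
            total += hits / rank  # precision at the rank of each positive clip
    return total / n_pos


def macro_auprc(y_true: List[List[int]], y_score: List[List[float]]) -> float:
    """Unweighted mean of the per-class average precision (multi-label input)."""
    n_classes = len(y_true[0])
    per_class = [
        average_precision([row[c] for row in y_true], [row[c] for row in y_score])
        for c in range(n_classes)
    ]
    return sum(per_class) / n_classes


# Toy multi-label data (hypothetical): 4 clips, 2 classes.
toy_true = [[1, 0], [0, 1], [1, 1], [0, 0]]
toy_score = [[0.9, 0.2], [0.8, 0.7], [0.3, 0.6], [0.1, 0.4]]
toy_result = macro_auprc(toy_true, toy_score)
print(f"macro-AUPRC on toy data: {toy_result:.3f}")
```

Because every class contributes equally to the mean, rare classes weigh as much as frequent ones, which is why a macro-averaged score is informative for imbalanced multi-label tagging tasks.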