Highly-Reverberant Real Environment database: HRRE
HRRE provides a needed evaluation dataset for researchers working on speech recognition in challenging real-world acoustic conditions, but the contribution is incremental because it builds on existing data.
The authors address the lack of a dataset for speech recognition in highly-reverberant real environments by creating the Highly-Reverberant Real Environment database (HRRE), which contains 13.4 hours of data recorded under 20 different testing conditions defined by reverberation time and speaker-to-microphone distance.
Speech recognition in highly-reverberant real environments remains a major challenge, and an evaluation dataset for this task is needed. This report describes the generation of the Highly-Reverberant Real Environment database (HRRE). The database contains 13.4 hours of data recorded in real reverberant environments and consists of 20 different testing conditions that cover a wide range of reverberation times and speaker-to-microphone distances. These evaluation sets were generated by re-recording the clean test set of the Aurora-4 database at five loudspeaker-microphone distances in each of four reverberant conditions.
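
The 20 testing conditions follow from crossing the four reverberant conditions with the five loudspeaker-microphone distances. The sketch below enumerates that grid. The labels (`rt1`..`rt4`, `d1`..`d5`) and the function name are hypothetical placeholders, since the text above gives only the counts and not the actual reverberation times or distances.

```python
from itertools import product

# Hypothetical labels: the description gives only the counts
# (4 reverberant conditions, 5 loudspeaker-microphone distances),
# not the actual reverberation times or distances.
REVERB_CONDITIONS = [f"rt{i}" for i in range(1, 5)]
DISTANCES = [f"d{j}" for j in range(1, 6)]


def testing_conditions():
    """Return one label per (reverberant condition, distance) pair."""
    return [f"{rt}_{d}" for rt, d in product(REVERB_CONDITIONS, DISTANCES)]


conditions = testing_conditions()
print(len(conditions))  # 20
```

Each label would correspond to one re-recorded copy of the Aurora-4 clean test set under that combination.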