SLoClas: A Database for Joint Sound Localization and Classification
This work provides a new dataset for audio-processing researchers studying sound localization and classification, but the contribution is incremental, since it builds on existing work in the field.
The authors introduced the SLoClas database for joint sound localization and classification, containing 23.27 hours of data with 10 sound classes and varied directions, and proposed a baseline method achieving 95.21% localization and 80.01% classification accuracy.
In this work, we present a new database, the Sound Localization and Classification (SLoClas) corpus, for studying and analyzing sound localization and classification. The corpus contains a total of 23.27 hours of data recorded using a 4-channel microphone array. Ten classes of sounds are played over a loudspeaker at a distance of 1.5 meters from the array, with the Direction-of-Arrival (DoA) varied from 1 degree to 360 degrees at an interval of 5 degrees. To facilitate the study of noise robustness, 6 types of outdoor noise are recorded at 4 DoAs using the same devices. Moreover, we propose a baseline method, the Sound Localization and Classification Network (SLCnet), and present experimental results and analysis on the collected SLoClas database. We achieve accuracies of 95.21% and 80.01% for sound localization and classification, respectively. We publicly release this database and the source code for research purposes.
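The DoA sampling described in the abstract can be made concrete with a minimal Python sketch. It assumes the grid starts at 1 degree and advances in 5-degree steps without exceeding 360 degrees, which is one reading of the abstract and not a confirmed detail of the released corpus. The function names `doa_grid` and `angular_error` are hypothetical, and the wrap-around error helper is only an illustration of circular angle comparison, not the metric used by SLCnet.

```python
def doa_grid(start_deg=1, stop_deg=360, step_deg=5):
    """Return DoA angles in degrees from start_deg up to stop_deg (inclusive bound).

    Assumption: the grid begins at 1 degree and steps by 5 degrees,
    as read from the abstract; the released data may index angles differently.
    """
    return list(range(start_deg, stop_deg + 1, step_deg))


def angular_error(pred_deg, true_deg):
    """Smallest absolute difference between two angles on a circle, in degrees.

    Hypothetical helper for illustration; not taken from the SLoClas source code.
    """
    diff = abs(pred_deg - true_deg) % 360
    return min(diff, 360 - diff)


angles = doa_grid()
print(len(angles), angles[0], angles[-1])
print(angular_error(356, 1))
```

Under these assumptions the grid holds 72 loudspeaker positions (1, 6, ..., 356), and neighbouring positions across the 360-degree boundary, such as 356 and 1, are still 5 degrees apart once the wrap-around is taken into account.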