Wearable SELD dataset: Dataset for sound event localization and detection using wearable devices around head
This dataset addresses the limited application scope of SELD in wearable contexts, such as pedestrian use, by enabling research with flexible microphone geometries. The contribution is incremental in that it builds on existing SELD methods.
The authors address the lack of datasets for sound event localization and detection (SELD) with wearable microphone arrays by proposing the Wearable SELD dataset, recorded using 24 microphones on a head and torso simulator with accessories such as glasses and earphones. They also provide experimental results with SELDNet to analyze the effect of microphone configuration.
Sound event localization and detection (SELD) is the combined task of identifying a sound event and its direction. Deep neural networks (DNNs) are used to associate both with the sound signals observed by a microphone array. Although ambisonic microphones are popular in the SELD literature, their predetermined geometry might limit the range of applications. Some applications (including those for pedestrians who perform SELD while walking) require a wearable microphone array whose geometry can be designed to suit the task. In this paper, to support the development of such wearable SELD systems, we propose a dataset named the Wearable SELD dataset. It consists of data recorded by 24 microphones placed on a head and torso simulator (HATS) with accessories mimicking wearable devices (glasses, earphones, and headphones). We also provide experimental results of SELD using the proposed dataset and SELDNet to investigate the effect of microphone configuration.
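Investigating the effect of microphone configuration amounts to training and evaluating on different subsets of the 24 recorded channels. The following is a minimal sketch of that channel-selection step, not the authors' code. The only value taken from the abstract is the count of 24 microphones. The configuration names, the channel indices assigned to each accessory, and the function `select_channels` are hypothetical, and the real channel layout must be taken from the dataset documentation.

```python
from typing import Dict, List, Sequence

NUM_MICS = 24  # stated in the abstract: 24 microphones on the HATS

# HYPOTHETICAL channel groupings for illustration only. The actual mapping
# between channel indices and accessories is not given in the text.
HYPOTHETICAL_CONFIGS: Dict[str, List[int]] = {
    "glasses": [0, 1, 2, 3],
    "earphones": [4, 5],
    "headphones": [6, 7, 8, 9],
    "all": list(range(NUM_MICS)),
}


def select_channels(
    recording: Sequence[Sequence[float]], channels: Sequence[int]
) -> List[List[float]]:
    """Return the requested channels of a recording stored as channels x samples."""
    if len(recording) != NUM_MICS:
        raise ValueError(f"expected {NUM_MICS} channels, got {len(recording)}")
    for ch in channels:
        if not 0 <= ch < NUM_MICS:
            raise ValueError(f"channel index {ch} is out of range")
    # Copy each selected channel so the caller can modify the subset freely.
    return [list(recording[ch]) for ch in channels]


# Dummy recording: channel k holds four samples, all equal to k.
dummy = [[float(k)] * 4 for k in range(NUM_MICS)]
subset = select_channels(dummy, HYPOTHETICAL_CONFIGS["earphones"])
print(len(subset), "channels,", len(subset[0]), "samples each")
```

The resulting subset would then be passed to feature extraction and a SELD model such as SELDNet, with one model trained per configuration so that the configurations can be compared.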