Quaternion Convolutional Neural Networks for Detection and Localization of 3D Sound Events
This paper addresses spatial sound analysis for audio processing applications, but its contribution is incremental: it applies an existing quaternion method to a specific domain.
The paper tackles 3D sound event detection and localization by processing the spherical harmonic components from a first-order ambisonic microphone with a quaternion convolutional neural network, improving accuracy by exploiting correlations among the ambisonic signals.
Learning from data in the quaternion domain enables us to exploit the internal dependencies of 4D signals and to treat them as a single entity. One kind of data that is well suited to quaternion-valued processing is 3D acoustic signals in their spherical harmonics decomposition. In this paper, we address the problem of localizing and detecting sound events in the spatial sound field by using quaternion-valued data processing. In particular, we consider the spherical harmonic components of the signals captured by a first-order ambisonic microphone and process them by using a quaternion convolutional neural network. Experimental results show that the proposed approach exploits the correlated nature of the ambisonic signals, thus improving accuracy in 3D sound event detection and localization.
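To make the idea of treating a 4D signal as a single entity concrete, the following is a minimal NumPy sketch of a quaternion convolution built on the Hamilton product. It is an illustration under stated assumptions, not the network used in the paper: the function names `hamilton_product` and `quaternion_conv1d` are hypothetical, and mapping the four first-order ambisonic channels (W, X, Y, Z) to the four quaternion components is an assumption made for the example only.

```python
import numpy as np

def hamilton_product(q, p):
    """Hamilton product of quaternions stored as (..., 4) arrays [r, i, j, k]."""
    r1, i1, j1, k1 = np.moveaxis(np.asarray(q, dtype=float), -1, 0)
    r2, i2, j2, k2 = np.moveaxis(np.asarray(p, dtype=float), -1, 0)
    return np.stack([
        r1 * r2 - i1 * i2 - j1 * j2 - k1 * k2,  # real part
        r1 * i2 + i1 * r2 + j1 * k2 - k1 * j2,  # i component
        r1 * j2 - i1 * k2 + j1 * r2 + k1 * i2,  # j component
        r1 * k2 + i1 * j2 - j1 * i2 + k1 * r2,  # k component
    ], axis=-1)

def quaternion_conv1d(x, w):
    """'Valid' 1D quaternion convolution (cross-correlation form, no flipping).

    x: (T, 4) array, one quaternion per time frame. In this sketch the four
       components are ASSUMED to hold the W, X, Y, Z ambisonic channels.
    w: (K, 4) array, a quaternion-valued kernel of length K.
    Returns an array of shape (T - K + 1, 4).
    """
    x = np.asarray(x, dtype=float)
    w = np.asarray(w, dtype=float)
    T, K = x.shape[0], w.shape[0]
    out = np.zeros((T - K + 1, 4))
    for t in range(T - K + 1):
        # Each kernel tap multiplies a whole quaternion, so all four
        # channels are mixed through one shared set of weights.
        out[t] = hamilton_product(w, x[t:t + K]).sum(axis=0)
    return out

# Usage: a random stand-in for 10 frames of a 4-channel signal.
rng = np.random.default_rng(0)
frames = rng.standard_normal((10, 4))
kernel = rng.standard_normal((3, 4))
features = quaternion_conv1d(frames, kernel)  # shape (8, 4)
```

Because each kernel tap is itself a quaternion, one tap couples all four input channels with only four real parameters, whereas an unconstrained real-valued 4-to-4 channel mix would need sixteen. This weight sharing is the mechanism by which a quaternion layer can capture the inter-channel correlations mentioned above.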