First Order Ambisonics Domain Spatial Augmentation for DNN-based Direction of Arrival Estimation
This work addresses data scarcity for researchers and practitioners in audio signal processing, though the contribution is incremental because it builds on existing augmentation principles.
The paper tackles the problem of limited data for training neural networks for Direction of Arrival (DOA) estimation by proposing a novel data augmentation method based on First Order Ambisonics transformations, which reduces DOA error by around 40% in experiments.
In this paper, we propose a novel data augmentation method for training neural networks for Direction of Arrival (DOA) estimation. The method expands a dataset's coverage of the DOA subspace: given some input data, it applies a transformation that changes the DOA information and simulates new, potentially unseen directions. In general, such a transformation is a combination of a rotation and a reflection, and it can be applied thanks to a well-known property of First Order Ambisonics (FOA). The same transformation is also applied to the labels, in order to maintain consistency between input data and target labels. Three methods with different levels of generality are proposed for applying this augmentation principle. Experiments are conducted on two different DOA networks. The results of both experiments demonstrate the effectiveness of the novel augmentation strategy, improving the DOA error by around 40%.
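The augmentation principle described above can be illustrated with a minimal NumPy sketch. It is not the paper's implementation and does not reproduce any of its three methods. It assumes a channel ordering of [W, X, Y, Z], ignores normalization conventions, and represents the label as a Cartesian unit vector. The function names `random_orthogonal` and `augment_foa` are hypothetical. The idea is that the omnidirectional channel W is left untouched, while the same 3x3 orthogonal matrix (a rotation, possibly combined with a reflection) is applied to the directional channels and to the DOA label.

```python
import numpy as np


def random_orthogonal(rng):
    """Draw a random 3x3 orthogonal matrix (rotation or rotation plus reflection)."""
    # QR decomposition of a Gaussian matrix yields an orthogonal factor q.
    q, r = np.linalg.qr(rng.standard_normal((3, 3)))
    # Flip column signs using the diagonal of r; q remains orthogonal.
    return q * np.sign(np.diag(r))


def augment_foa(foa, doa, transform):
    """Apply the same orthogonal transform to FOA audio and its DOA label.

    foa:       array of shape (4, n_samples), assumed ordering [W, X, Y, Z]
    doa:       Cartesian unit vector of shape (3,), the target label
    transform: 3x3 orthogonal matrix
    """
    out = foa.copy()
    # W (index 0) is omnidirectional and stays unchanged.
    # X, Y, Z behave like a direction vector, so they are mixed by the matrix.
    out[1:4] = transform @ foa[1:4]
    # Transform the label identically to keep input and target consistent.
    return out, transform @ doa
```

For a single plane wave with signal `s` arriving from direction `d`, the directional channels are proportional to `d * s`, so after the transform they are proportional to `(transform @ d) * s`, which is exactly the new label.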