On The Relevance Of The Differences Between HRTF Measurement Setups For Machine Learning
This work addresses a data scarcity issue faced by researchers and engineers in spatial audio, but it is incremental because it evaluates existing differences between datasets rather than proposing a new solution.
The paper tackles the problem of limited head-related transfer function (HRTF) data for machine learning by investigating the impact of combining datasets measured under different conditions, finding that certain dataset differences significantly affect model performance.
As spatial audio is enjoying a surge in popularity, data-driven machine learning techniques that have proven successful in other domains are increasingly used to process head-related transfer function (HRTF) measurements. However, these techniques require large amounts of data, whereas existing datasets range from tens to the low hundreds of datapoints. It therefore becomes attractive to combine several of these datasets, even though they were measured under different conditions. In this paper, we first establish the common ground between a number of datasets, then investigate potential pitfalls of mixing them. We perform a simple experiment to test the relevance of the remaining differences between datasets when applying machine learning techniques. Finally, we pinpoint the most relevant differences.
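The abstract does not describe the experiment in detail, so the following is only a minimal sketch of one way the relevance of setup differences could be probed, not the authors' procedure. It assumes two hypothetical datasets "A" and "B" filled with synthetic magnitude responses, where a constant `setup_offset` stands in for a systematic measurement-setup difference. A nearest-centroid classifier then tries to predict which dataset a response came from. All names, sizes, and the offset model are illustrative assumptions.

```python
import random


def make_dataset(n_subjects, n_bins, setup_offset, rng):
    """Synthetic stand-in for one dataset of HRTF magnitude responses.

    setup_offset is a hypothetical systematic bias added to every
    frequency bin to mimic a measurement-setup difference.
    """
    return [
        [rng.gauss(0.0, 1.0) + setup_offset for _ in range(n_bins)]
        for _ in range(n_subjects)
    ]


def centroid(rows):
    """Per-bin mean over all responses in rows."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def sq_dist(a, b):
    """Squared Euclidean distance between two responses."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def dataset_id_accuracy(train, test):
    """Fit a nearest-centroid classifier on train, score it on test.

    Both arguments map a dataset label to a list of responses.
    """
    centroids = {label: centroid(rows) for label, rows in train.items()}
    correct = 0
    total = 0
    for label, rows in test.items():
        for row in rows:
            pred = min(centroids, key=lambda c: sq_dist(row, centroids[c]))
            correct += int(pred == label)
            total += 1
    return correct / total


def run_experiment(setup_offset, seed=0, n_subjects=60, n_bins=32):
    """Return the accuracy of predicting the dataset of origin."""
    rng = random.Random(seed)
    a = make_dataset(n_subjects, n_bins, 0.0, rng)
    b = make_dataset(n_subjects, n_bins, setup_offset, rng)
    half = n_subjects // 2  # first half for training, second half for testing
    train = {"A": a[:half], "B": b[:half]}
    test = {"A": a[half:], "B": b[half:]}
    return dataset_id_accuracy(train, test)


if __name__ == "__main__":
    print("no setup difference:  ", run_experiment(0.0))
    print("with setup difference:", run_experiment(1.0))
```

Under these assumptions, accuracy near chance (0.5) suggests the two collections are interchangeable from the model's point of view, whereas accuracy well above chance indicates that a setup-specific signature remains and could bias a model trained on the mixed data.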