SDLG | Apr 24, 2025

A Machine Learning Approach for Denoising and Upsampling HRTFs

arXiv:2504.17586v1 | 3 citations | h-index: 3 | EUSIPCO
Originality: Incremental advance
AI Analysis

This work addresses the time-consuming, noise-sensitive measurement of HRTFs for virtual immersive audio and offers a practical solution for audio engineers and VR developers. It is an incremental advance, since it builds on existing machine learning techniques.

The paper tackles the problem of generating high-quality Head-Related Transfer Functions (HRTFs) from sparse, noisy measurements by proposing a method that combines denoising and upsampling, achieving a log-spectral distortion error of 5.41 dB and a cosine similarity loss of 0.0070.

The demand for realistic virtual immersive audio continues to grow, with Head-Related Transfer Functions (HRTFs) playing a key role. HRTFs capture how sound reaches our ears, reflecting unique anatomical features and enhancing spatial perception. It has been shown that personalized HRTFs improve localization accuracy, but their measurement remains time-consuming and requires a noise-free environment. Although machine learning has been shown to reduce the required measurement points and, thus, the measurement time, a controlled environment is still necessary. This paper proposes a method to address this constraint by presenting a novel technique that can upsample sparse, noisy HRTF measurements. The proposed approach combines an HRTF Denoisy U-Net for denoising and an Autoencoding Generative Adversarial Network (AE-GAN) for upsampling from three measurement points. The proposed method achieves a log-spectral distortion (LSD) error of 5.41 dB and a cosine similarity loss of 0.0070, demonstrating the method's effectiveness in HRTF upsampling.
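The two reported metrics can be illustrated with a minimal sketch. The abstract does not state the exact formulas, so the definitions here are assumptions: LSD is taken as the root-mean-square difference, in dB, between two magnitude spectra over frequency bins, and the cosine similarity loss is taken as one minus the cosine similarity. The function names and the `eps` guard are hypothetical and are not taken from the paper.

```python
import math


def log_spectral_distortion(ref_mag, est_mag, eps=1e-12):
    """RMS difference in dB between two magnitude spectra of equal length.

    Assumed definition: sqrt(mean_k (20 * log10(|H_k| / |H_est_k|))^2).
    """
    if len(ref_mag) != len(est_mag) or not ref_mag:
        raise ValueError("spectra must be non-empty and of equal length")
    squared_db = [
        (20.0 * math.log10((r + eps) / (e + eps))) ** 2
        for r, e in zip(ref_mag, est_mag)
    ]
    return math.sqrt(sum(squared_db) / len(squared_db))


def cosine_similarity_loss(ref, est, eps=1e-12):
    """Assumed definition: 1 - cosine similarity of the two vectors."""
    if len(ref) != len(est) or not ref:
        raise ValueError("vectors must be non-empty and of equal length")
    dot = sum(r * e for r, e in zip(ref, est))
    norm = math.sqrt(sum(r * r for r in ref)) * math.sqrt(sum(e * e for e in est))
    return 1.0 - dot / (norm + eps)


if __name__ == "__main__":
    # Toy magnitude spectra for one direction and one ear (made-up values).
    reference = [1.0, 0.8, 0.5, 0.25]
    estimate = [0.9, 0.85, 0.45, 0.3]
    print("LSD (dB):", log_spectral_distortion(reference, estimate))
    print("cosine loss:", cosine_similarity_loss(reference, estimate))
```

In practice such per-spectrum values would typically be averaged over all measurement directions and both ears. Whether the paper aggregates them this way is not stated in the abstract.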
