ASSDJul 23, 2020

Sound Field Translation and Mixed Source Model for Virtual Applications with Perceptual Validation

arXiv:2007.11795v1
AI Analysis

This work addresses the need for interactive and navigable sound fields in virtual reality applications, offering an incremental improvement over existing sound field translation techniques.

The paper tackles the problem of restricted listener movement in virtual acoustic reproductions by proposing a mixed source model that combines near-field and far-field sources in a sparse virtual environment, resulting in improved source localizability and reduced spectral distortions compared to a planewave benchmark.

Non-interactive and linear experiences like cinema film offer high quality surround sound audio to enhance immersion, however the listener's experience is usually fixed to a single acoustic perspective. With the rise of virtual reality, there is a demand for recording and recreating real-world experiences in a way that allows for the user to interact and move within the reproduction. Conventional sound field translation techniques take a recording and expand it into an equivalent environment of virtual sources. However, the finite sampling of a commercial higher order microphone produces an acoustic sweet-spot in the virtual reproduction. As a result, the technique remains to restrict the listener's navigable region. In this paper, we propose a method for listener translation in an acoustic reproduction that incorporates a mixture of near-field and far-field sources in a sparsely expanded virtual environment. We perceptually validate the method through a Multiple Stimulus with Hidden Reference and Anchor (MUSHRA) experiment. Compared to the planewave benchmark, the proposed method offers both improved source localizability and robustness to spectral distortions at translated positions. A cross-examination with numerical simulations demonstrated that the sparse expansion relaxes the inherent sweet-spot constraint, leading to the improved localizability for sparse environments. Additionally, the proposed method is seen to better reproduce the intensity and binaural room impulse response spectra of near-field environments, further supporting the strong perceptual results.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes