ASCVLGMMSDFeb 2, 2023

Listen2Scene: Interactive material-aware binaural sound propagation for reconstructed 3D scenes

arXiv:2302.02809v419 citationsh-index: 102
AI Analysis

This addresses realistic sound simulation in virtual environments for VR/AR users, representing a novel method for a known bottleneck.

The paper tackles binaural audio rendering for VR/AR by proposing Listen2Scene, a neural-network-based method that generates acoustic effects for indoor 3D scenes, achieving generation in 0.1 milliseconds and producing more plausible audio than prior methods.

We present an end-to-end binaural audio rendering approach (Listen2Scene) for virtual reality (VR) and augmented reality (AR) applications. We propose a novel neural-network-based binaural sound propagation method to generate acoustic effects for indoor 3D models of real environments. Any clean audio or dry audio can be convolved with the generated acoustic effects to render audio corresponding to the real environment. We propose a graph neural network that uses both the material and the topology information of the 3D scenes and generates a scene latent vector. Moreover, we use a conditional generative adversarial network (CGAN) to generate acoustic effects from the scene latent vector. Our network can handle holes or other artifacts in the reconstructed 3D mesh model. We present an efficient cost function for the generator network to incorporate spatial audio effects. Given the source and the listener position, our learning-based binaural sound propagation approach can generate an acoustic effect in 0.1 milliseconds on an NVIDIA GeForce RTX 2080 Ti GPU. We have evaluated the accuracy of our approach with binaural acoustic effects generated using an interactive geometric sound propagation algorithm and captured real acoustic effects / real-world recordings. We also performed a perceptual evaluation and observed that the audio rendered by our approach is more plausible than audio rendered using prior learning-based and geometric-based sound propagation algorithms. We quantitatively evaluated the accuracy of our approach using statistical acoustic parameters, and energy decay curves. The demo videos, code and dataset are available online (https://anton-jeran.github.io/Listen2Scene/).

Code Implementations1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes