Masked Wavelet Scattering Transform Neural Field for Sound Field Reconstruction

Xinmeng Luan, Samuel A. Verburg, Efren Fernandez-Grande, Gary Scavone

arXiv:2606.04370

AI Analysis

For audio researchers, this work offers a new way to impose statistical priors in sound field reconstruction. The contribution is incremental: it combines existing techniques (the WST and neural fields) with a masking strategy.

The paper proposes a neural field framework that uses a masked Wavelet Scattering Transform (WST) as a multi-scale feature extractor for sound field reconstruction, validated as a proof of concept on head-related transfer function (HRTF) upsampling. Comparisons against baseline methods, which double as an ablation study of the framework's components, indicate that the approach is effective under sparse observations.

In this paper, we propose a reconstruction framework that leverages the Wavelet Scattering Transform (WST) as a multi-scale feature extractor to impose statistical priors under sparse observation conditions. The reconstruction problem is formulated as an optimization task and solved using a neural field, with the WST incorporated into the training loss function. As a proof of concept, we validate the proposed method on HRTF upsampling. A masking strategy is applied to the WST coefficients, resulting in a two-phase procedure. The first phase learns a binary mask from a small multi-subject dataset, while the second phase applies the learned mask to the WST coefficients of an individual HRTF to preserve informative statistical structures during reconstruction. Validation against baseline methods, which also serve as an ablation study of the different components of the framework, demonstrates the effectiveness of the proposed approach.
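
To make the idea of a masked scattering term in the training loss concrete, here is a minimal sketch in pure Python. It is not the authors' implementation. It assumes a toy Haar-based scattering transform with global averaging in place of the paper's WST, a hand-specified binary mask in place of the learned one, and a plain list of samples in place of a neural field output. All function and parameter names (`haar_detail`, `scattering`, `masked_scattering_loss`, `weight`) are hypothetical.

```python
import math


def haar_detail(x, j):
    """Circular convolution of x with a Haar wavelet of support 2**(j + 1)."""
    n, h = len(x), 2 ** j
    out = []
    for t in range(n):
        first = sum(x[(t + k) % n] for k in range(h))
        second = sum(x[(t + h + k) % n] for k in range(h))
        out.append((first - second) / (2 * h))
    return out


def scattering(x, J):
    """Toy scattering coefficients of orders 0, 1 and 2 (with j2 > j1).

    Assumption: the low-pass averaging is a global mean, so each
    scattering path yields a single number.
    """
    def mean(v):
        return sum(v) / len(v)

    coeffs = [mean(x)]                                   # order 0
    for j1 in range(J):
        u1 = [abs(v) for v in haar_detail(x, j1)]        # wavelet modulus
        coeffs.append(mean(u1))                          # order 1
        for j2 in range(j1 + 1, J):
            u2 = [abs(v) for v in haar_detail(u1, j2)]
            coeffs.append(mean(u2))                      # order 2
    return coeffs


def masked_scattering_loss(pred, observations, ref_coeffs, mask, J=3, weight=1.0):
    """Data-fit term on sparse samples plus a masked scattering prior.

    observations: dict mapping sample index -> observed value.
    ref_coeffs:   reference scattering coefficients (assumed to be given).
    mask:         0/1 entries selecting which coefficients are penalised.
    """
    data = sum((pred[i] - v) ** 2 for i, v in observations.items())
    data /= len(observations)
    pred_coeffs = scattering(pred, J)
    prior = sum(m * (a - b) ** 2 for m, a, b in zip(mask, pred_coeffs, ref_coeffs))
    return data + weight * prior


if __name__ == "__main__":
    signal = [math.sin(2 * math.pi * t / 16) for t in range(16)]
    ref = scattering(signal, J=3)
    sparse = {0: signal[0], 4: signal[4], 8: signal[8]}
    mask = [1] * len(ref)
    print(masked_scattering_loss([0.0] * 16, sparse, ref, mask))
```

In a setting like the one the abstract describes, `pred` would be the neural field evaluated on the reconstruction grid and the loss would be minimised by gradient descent, which requires a differentiable implementation rather than this list-based one. Setting a mask entry to 0 removes that scattering path from the prior, which is the role the learned binary mask plays in the second phase.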
