One or Two Components? The Scattering Transform Answers
This work addresses biologically plausible auditory modeling for signal processing applications, but it appears incremental as it extends existing frameworks to more components.
The paper analyzes how wavelet scattering networks represent multicomponent stationary signals, with the aim of building a biologically plausible model of machine listening. It shows that renormalizing second-order nodes by their first-order parents provides a numerical criterion for psychoacoustic interference between neighboring components, and it proves that the effective scattering depth of a Fourier series grows logarithmically with its bandwidth.
With the aim of constructing a biologically plausible model of machine listening, we study the representation of a multicomponent stationary signal by a wavelet scattering network. First, we show that renormalizing second-order nodes by their first-order parents gives a simple numerical criterion to assess whether two neighboring components will interfere psychoacoustically. Second, we run a manifold learning algorithm (Isomap) on scattering coefficients to visualize the similarity space underlying parametric additive synthesis. Third, we generalize the "one or two components" framework to three or more sine waves, and prove that the effective scattering depth of a Fourier series grows in logarithmic proportion to its bandwidth.
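The renormalization criterion can be illustrated with a minimal sketch. This is not the paper's implementation: it assumes a single first-order band and a single second-order band, each modeled as a Gaussian band-pass filter in the Fourier domain rather than a full constant-Q wavelet filterbank. The function names and all parameter values (sample rate, center frequencies, widths) are hypothetical choices made for illustration.

```python
import numpy as np


def gaussian_bandpass(n, sr, center_hz, width_hz):
    """Frequency response of a Gaussian band-pass filter (assumed stand-in
    for a wavelet). It is centered on a positive frequency, so it rejects
    negative frequencies and yields a complex analytic output."""
    freqs = np.fft.fftfreq(n, d=1.0 / sr)
    return np.exp(-0.5 * ((freqs - center_hz) / width_hz) ** 2)


def band_modulus(x, sr, center_hz, width_hz):
    """Filter x in the Fourier domain, then take the complex modulus."""
    spectrum = np.fft.fft(x) * gaussian_bandpass(len(x), sr, center_hz, width_hz)
    return np.abs(np.fft.ifft(spectrum))


def renormalized_second_order(x, sr, carrier, modulation):
    """Time-averaged second-order node divided by its first-order parent.

    carrier and modulation are (center_hz, width_hz) pairs.
    """
    u1 = band_modulus(x, sr, *carrier)      # first-order envelope
    u2 = band_modulus(u1, sr, *modulation)  # modulation of that envelope
    return u2.mean() / u1.mean()


# Hypothetical test signals: one second of audio at 8 kHz.
sr = 8000
t = np.arange(sr) / sr
carrier = (455.0, 50.0)    # band straddling 440 Hz and 470 Hz
modulation = (30.0, 5.0)   # band tuned to the 30 Hz beat rate

one = np.cos(2 * np.pi * 440 * t)
near = one + np.cos(2 * np.pi * 470 * t)   # both partials in the same band
far = one + np.cos(2 * np.pi * 2000 * t)   # second partial outside the band

for name, x in [("one", one), ("near", near), ("far", far)]:
    ratio = renormalized_second_order(x, sr, carrier, modulation)
    print(f"{name}: {ratio:.3f}")
```

Under these assumptions, the ratio stays near zero for a single sine wave and for two widely separated sine waves, because the first-order envelope is then constant. It rises to roughly one third when two partials fall within the same first-order band and beat against each other. A threshold on this ratio is one way to read the "one or two components" criterion numerically.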