SD LG NE AS · Sep 13, 2024

Biomimetic Frontend for Differentiable Audio Processing

Ruolan Leslie Famularo, Dmitry N. Zotkin, Shihab A. Shamma, Ramani Duraiswami

arXiv:2409.08997v1 · 2.7 · h-index: 67 · Has Code

Originality: Incremental advance

AI Analysis

This work addresses the need for more efficient and robust audio processing models, particularly for applications with limited data. The advance is incremental: it combines existing biomimetic and deep-learning methods.

The authors address the tendency of deep audio models to require large datasets and to be brittle. They make a classical model of human hearing differentiable, which enables training on modest amounts of data. Their model outperformed black-box approaches in computational efficiency and robustness on tasks including classification and enhancement.

While models in audio and speech processing are becoming deeper and more end-to-end, they as a consequence need expensive training on large data, and are often brittle. We build on a classical model of human hearing and make it differentiable, so that we can combine traditional explainable biomimetic signal processing approaches with deep-learning frameworks. This allows us to arrive at an expressive and explainable model that is easily trained on modest amounts of data. We apply this model to audio processing tasks, including classification and enhancement. Results show that our differentiable model surpasses black-box approaches in terms of computational efficiency and robustness, even with little training data. We also discuss other potential applications.
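To make the idea of a differentiable signal-processing stage concrete, here is a minimal sketch in plain Python. It is not the authors' hearing model: the one-pole leaky integrator, the function names `leaky_integrator` and `fit_coefficient`, the square-wave input, and the learning rate are all assumptions chosen for illustration. The sketch only shows how a classical filter with a single interpretable coefficient can expose a gradient and be fitted by gradient descent from one short signal.

```python
# Illustrative only: a one-pole leaky integrator, NOT the paper's auditory model.
# y[n] = a * y[n-1] + (1 - a) * x[n], with the coefficient `a` treated as trainable.


def leaky_integrator(x, a):
    """Return the filter output and its derivative with respect to `a`."""
    y_prev, dy_prev = 0.0, 0.0
    ys, dys = [], []
    for xn in x:
        # Forward-mode derivative of the recursion, using the previous output.
        dy = y_prev + a * dy_prev - xn
        y = a * y_prev + (1.0 - a) * xn
        ys.append(y)
        dys.append(dy)
        y_prev, dy_prev = y, dy
    return ys, dys


def fit_coefficient(x, target, a=0.5, lr=0.1, steps=500):
    """Fit `a` by gradient descent on the mean squared error to `target`."""
    for _ in range(steps):
        ys, dys = leaky_integrator(x, a)
        grad = sum(2.0 * (y - t) * d for y, t, d in zip(ys, target, dys)) / len(x)
        # Clamp so the filter stays stable (0 < a < 1).
        a = min(max(a - lr * grad, 1e-3), 0.999)
    return a


# A single short square wave stands in for "modest data" (hypothetical toy setup).
x = [1.0 if n % 8 < 4 else 0.0 for n in range(64)]
target, _ = leaky_integrator(x, 0.8)  # output of a reference filter with a = 0.8
a_hat = fit_coefficient(x, target)    # estimate of `a`, starting from 0.5
```

Because the only trainable quantity is the filter coefficient, the fitted value can be read directly as a smoothing constant. That is the kind of interpretability the abstract contrasts with black-box models. An automatic-differentiation framework would replace the hand-written derivative in practice.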
