SD · LG · AS · Jul 12, 2022

EfficientLEAF: A Faster LEarnable Audio Frontend of Questionable Use

arXiv:2207.05508v1 · 20 citations · h-index: 20
Originality: Synthesis-oriented
AI Analysis

This is an incremental improvement for audio classification researchers: it reduces computational cost, but does not solve the core problem that learnable frontends fail to consistently outperform a fixed mel filterbank.

The paper tackles the computational expense of LEAF, a learnable audio frontend, by proposing EfficientLEAF, which modifies the convolution kernels and replaces costly operations. EfficientLEAF reaches similar accuracy at 3% of the cost, but neither method consistently outperforms a fixed mel filterbank on audio classification tasks.
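To make the kernel modification concrete, here is a minimal sketch of a Gabor kernel (a Gaussian envelope multiplied by a complex sinusoid) with a length that depends on the envelope width. The sizing rule `kernel_length_for`, its `num_sigmas` parameter, and both function names are hypothetical illustrations, not the rule used in the paper: they only show why filters with narrow envelopes can get by with far shorter kernels than filters with wide ones.

```python
import numpy as np


def gabor_kernel(center_freq, sigma, length):
    """Complex Gabor kernel sampled on `length` taps centred at zero.

    center_freq: centre frequency in radians per sample.
    sigma: standard deviation of the Gaussian envelope, in samples.
    length: number of taps (odd, so the kernel has a centre tap).
    """
    t = np.arange(length) - length // 2
    envelope = np.exp(-(t ** 2) / (2.0 * sigma ** 2)) / (np.sqrt(2.0 * np.pi) * sigma)
    return envelope * np.exp(1j * center_freq * t)


def kernel_length_for(sigma, num_sigmas=3.0):
    """Hypothetical sizing rule: cover +/- num_sigmas standard deviations."""
    half = int(np.ceil(num_sigmas * sigma))
    return 2 * half + 1


# A narrow envelope needs few taps, a wide envelope needs many.
short_len = kernel_length_for(2.0)    # 13 taps
long_len = kernel_length_for(50.0)    # 301 taps
short_kernel = gabor_kernel(center_freq=2.0, sigma=2.0, length=short_len)
long_kernel = gabor_kernel(center_freq=0.05, sigma=50.0, length=long_len)
```

Under this assumed rule, using one shared kernel size for every filter would force the narrow filter to carry 301 taps, almost all of them near zero.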

In audio classification, differentiable auditory filterbanks with few parameters cover the middle ground between hard-coded spectrograms and raw audio. LEAF (arXiv:2101.08596), a Gabor-based filterbank combined with Per-Channel Energy Normalization (PCEN), has shown promising results, but is computationally expensive. With inhomogeneous convolution kernel sizes and strides, and by replacing PCEN with better parallelizable operations, we can reach similar results more efficiently. In experiments on six audio classification tasks, our frontend matches the accuracy of LEAF at 3% of the cost, but both fail to consistently outperform a fixed mel filterbank. The quest for learnable audio frontends is not solved.
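As a rough illustration of why PCEN is hard to parallelize, the sketch below implements the standard PCEN formulation with NumPy: a first-order recursive smoother along time, followed by adaptive gain control and root compression. The parameter values and the choice to initialise the smoother with the first frame are illustrative assumptions, not the settings used by LEAF or EfficientLEAF, and the sketch does not show the operations EfficientLEAF uses instead.

```python
import numpy as np


def pcen(energy, s=0.04, alpha=0.96, delta=2.0, r=0.5, eps=1e-12):
    """Per-Channel Energy Normalization of a (channels, frames) array.

    energy: non-negative filterbank energies.
    s: smoothing coefficient of the recursive filter (illustrative value).
    alpha, delta, r: gain, bias and compression exponent (illustrative values).
    """
    energy = np.asarray(energy, dtype=np.float64)
    smoothed = np.empty_like(energy)
    # Assumption: start the smoother from the first frame.
    smoothed[:, 0] = energy[:, 0]
    # Each frame depends on the previous smoothed frame, so this loop
    # runs sequentially over time.
    for t in range(1, energy.shape[1]):
        smoothed[:, t] = (1.0 - s) * smoothed[:, t - 1] + s * energy[:, t]
    return (energy / (eps + smoothed) ** alpha + delta) ** r - delta ** r


# Usage: 40 channels, 100 frames of random non-negative energies.
rng = np.random.default_rng(0)
normalized = pcen(rng.random((40, 100)))
```

The loop over `t` is the sequential dependency: channels can be processed in parallel, but time steps cannot without reformulating the recursion.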

Code Implementations: 1 repo
