SD, NE · Jun 26, 2017

Between Homomorphic Signal Processing and Deep Neural Networks: Constructing Deep Algorithms for Polyphonic Music Transcription

arXiv:1706.08231v1 · 2 citations
Originality: Incremental advance
AI Analysis

The paper offers an incremental theoretical explanation of how DNNs work in audio processing, which may help researchers in signal processing and music transcription.

This paper tackles the problem of understanding deep neural networks (DNNs) by applying homomorphic signal processing to multi-pitch estimation, demonstrating equivalence between a generalized cepstrum and DNNs and proposing a new feature that outperforms the one-layer spectrum in this task.

This paper presents a new approach to understanding how deep neural networks (DNNs) work by applying homomorphic signal processing techniques. Focusing on the task of multi-pitch estimation (MPE), the paper demonstrates an equivalence relation between a generalized cepstrum and a DNN in terms of their structure and functionality. This equivalence relation, together with pitch perception theories and the recently established rectified-correlations-on-a-sphere (RECOS) filter analysis, provides an alternative way of explaining the role of the nonlinear activation function and the multi-layer structure, both of which exist in a cepstrum and in a DNN. To validate the efficacy of this approach, a new feature designed in the same fashion is proposed for the pitch salience function. The new feature outperforms the one-layer spectrum in the MPE task and, as predicted, it addresses the missing fundamental effect and achieves better robustness to noise.
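The layered view described in the abstract can be illustrated with a small sketch. The code below is not the paper's implementation; it assumes a simplified form of the generalized cepstrum (magnitude spectrum, power-law compression with a hypothetical exponent `gamma`, inverse DFT, then rectification) and a synthetic tone whose 200 Hz fundamental is absent. The function name `generalized_cepstrum`, the sampling rate, the harmonic amplitudes, and the pitch search range are all illustrative assumptions.

```python
import numpy as np

def generalized_cepstrum(frame, gamma=0.5):
    """Sketch of one extra 'layer': DFT -> power nonlinearity -> inverse DFT -> ReLU.

    gamma is an assumed compression exponent (0 < gamma <= 1).
    Returns the one-layer magnitude spectrum and the rectified cepstrum.
    """
    spectrum = np.abs(np.fft.rfft(frame))               # layer 1: magnitude spectrum
    compressed = spectrum ** gamma                      # nonlinear activation (power scaling)
    cepstrum = np.fft.irfft(compressed, n=len(frame))   # layer 2: back to the lag (quefrency) domain
    return spectrum, np.maximum(cepstrum, 0.0)          # rectification, analogous to a ReLU

# Synthetic tone with a missing fundamental: harmonics 2..5 of 200 Hz only.
fs = 8000            # sampling rate in Hz (assumed)
n_samples = 4000     # 0.5 s, so every harmonic falls exactly on a DFT bin
f0 = 200.0
t = np.arange(n_samples) / fs
amps = {2: 1.0, 3: 0.8, 4: 0.6, 5: 0.4}  # harmonic number -> amplitude
frame = sum(a * np.cos(2 * np.pi * h * f0 * t) for h, a in amps.items())

spectrum, cepstrum = generalized_cepstrum(frame)

# One-layer estimate: the strongest spectral peak (lands on the 2nd harmonic).
freqs = np.fft.rfftfreq(n_samples, d=1 / fs)
spectral_pitch = freqs[np.argmax(spectrum)]

# Two-layer estimate: the strongest cepstral peak within an assumed 110-1000 Hz pitch range.
q_lo, q_hi = int(fs / 1000), int(fs / 110)
best_lag = q_lo + int(np.argmax(cepstrum[q_lo:q_hi + 1]))
cepstral_pitch = fs / best_lag

print(f"spectrum peak: {spectral_pitch:.1f} Hz, cepstrum peak: {cepstral_pitch:.1f} Hz")
```

In this toy case the spectrum alone points at the second harmonic, because no energy exists at 200 Hz, while the rectified cepstrum peaks at the lag of one fundamental period. The search range matters: lags at integer multiples of the period produce equally strong peaks, so the upper lag bound is chosen here to exclude them.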

