NE · CG · MM · SD · Aug 26, 2016

Applying Topological Persistence in Convolutional Neural Network for Music Audio Signals

arXiv:1608.07373v1 · 39 citations
Originality: Highly original
AI Analysis

This work addresses the fact that neural networks for audio processing seldom exploit shape information. It integrates persistent homology into a convolutional neural network and reports improved results on automatic music tagging, a multi-label classification task.

The paper incorporates shape information into deep neural networks for audio signals by embedding persistence landscapes into a convolutional neural network. The resulting persistent convolutional neural network (PCNN) is reported to perform significantly better than state-of-the-art models in prediction accuracy on automatic music tagging.

Recent years have witnessed an increased interest in the application of persistent homology, a topological tool for data analysis, to machine learning problems. Persistent homology is known for its ability to numerically characterize the shapes of spaces induced by features or functions. On the other hand, deep neural networks have been shown to be effective in various tasks. To the best of our knowledge, however, existing neural network models seldom exploit shape information. In this paper, we investigate a way to use persistent homology in the framework of deep neural networks. Specifically, we propose to embed the so-called "persistence landscape," a rather new topological summary for data, into a convolutional neural network (CNN) for dealing with audio signals. Our evaluation on automatic music tagging, a multi-label classification task, shows that the resulting persistent convolutional neural network (PCNN) model can perform significantly better than state-of-the-art models in prediction accuracy. We also discuss the intuition behind the design of the proposed model, and offer insights into the features that it learns.
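To make the "persistence landscape" mentioned in the abstract more concrete, here is a minimal sketch of its standard definition. Each birth-death pair (b, d) of a persistence diagram contributes a tent function max(0, min(t - b, d - t)), and the k-th landscape layer at t is the k-th largest tent value at t. The function name `persistence_landscape`, the parameter `k_max`, and the toy diagram are illustrative assumptions. The abstract does not say how the diagram is obtained from audio or how the landscape is wired into the CNN, so neither is shown.

```python
def persistence_landscape(diagram, grid, k_max=3):
    """Sample the first k_max landscape layers on the points of `grid`.

    diagram: iterable of (birth, death) pairs (hypothetical toy input here).
    Returns a list of k_max lists; entry [k][j] is lambda_{k+1}(grid[j]).
    """
    layers = [[0.0] * len(grid) for _ in range(k_max)]
    for j, t in enumerate(grid):
        # Height of every tent function at t, largest first.
        heights = sorted(
            (max(0.0, min(t - b, d - t)) for b, d in diagram),
            reverse=True,
        )
        # The k-th largest height is the value of the k-th layer at t;
        # layers beyond the number of diagram points stay at zero.
        for k, h in enumerate(heights[:k_max]):
            layers[k][j] = h
    return layers


# Toy diagram with three intervals, NOT taken from the paper.
toy_diagram = [(0.0, 4.0), (1.0, 3.0), (2.0, 6.0)]
grid = [float(t) for t in range(7)]  # t = 0, 1, ..., 6
landscape = persistence_landscape(toy_diagram, grid, k_max=2)
```

Each layer is a fixed-length vector once a grid is chosen, which is what makes a landscape easier to feed into a neural network than a raw persistence diagram of varying size.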
