SD · LG · MM · AS · Apr 10, 2019

Neuralogram: A Deep Neural Network Based Representation for Audio Signals

arXiv:1904.05073v1 · 10 citations
Originality: Synthesis-oriented
AI Analysis

This paper addresses the need for better audio signal representations for tasks such as music recommendation and meta-data extraction, but it appears incremental: it builds on existing neural methods without claiming a major breakthrough.

The paper tackles the problem of representing audio signals by proposing Neuralogram, a deep neural network-based dense and compact representation that encapsulates pitch, timbre, rhythm, and other attributes, showing potential for applications like audio understanding and music recommendation.

We propose the Neuralogram -- a deep neural network based representation for understanding audio signals which, as the name suggests, transforms an audio signal to a dense, compact representation based upon embeddings learned via a neural architecture. Through a series of probing signals, we show how our representation can encapsulate pitch, timbre and rhythm-based information, and other attributes. This representation suggests a method for revealing meaningful relationships in arbitrarily long audio signals that are not readily represented by existing algorithms. This has the potential for numerous applications in audio understanding, music recommendation, and meta-data extraction, to name a few.
