SD AI LG NE AS
Nov 3, 2022

HyperSound: Generating Implicit Neural Representations of Audio Signals with Hypernetworks

Filip Szatkowski, Karol J. Piczak, Przemysław Spurek, Jacek Tabor, Tomasz Trzciński

arXiv:2211.01839v214.520 citations
h-index: 27

Originality: Incremental advance

AI Analysis

The work addresses the need to train a separate model for every sample when building audio INRs. The advance is incremental because it adapts existing INR techniques to the audio domain.

The authors tackled the problem of generating implicit neural representations (INRs) for audio signals without requiring separate training for each sample, proposing HyperSound, a meta-learning method using hypernetworks that reconstructs sound waves with quality comparable to state-of-the-art models.

Implicit neural representations (INRs) are a rapidly growing research field, which provides alternative ways to represent multimedia signals. Recent applications of INRs include image super-resolution, compression of high-dimensional signals, and 3D rendering. However, these solutions usually focus on visual data, and adapting them to the audio domain is not trivial. Moreover, they require a separately trained model for every data sample. To address this limitation, we propose HyperSound, a meta-learning method leveraging hypernetworks to produce INRs for audio signals unseen at training time. We show that our approach can reconstruct sound waves with quality comparable to other state-of-the-art models.
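
The general idea in the abstract, a hypernetwork that emits the weights of a per-signal INR, can be illustrated with a minimal NumPy sketch. This is not the authors' architecture: the layer sizes, the single linear hypernetwork layer, the sinusoidal activation, and every name below (`hypernetwork`, `unpack`, `inr`, `N_SAMPLES`, `HIDDEN`) are assumptions made only for illustration, and the weights are random and untrained, so the reconstruction error is not meaningful.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes, chosen only for illustration.
N_SAMPLES = 256   # length of the raw waveform fed to the hypernetwork
HIDDEN = 16       # width of the target INR's single hidden layer

# Parameter shapes of the target network: time (1) -> HIDDEN -> amplitude (1).
TARGET_SHAPES = [(1, HIDDEN), (HIDDEN,), (HIDDEN, 1), (1,)]
N_TARGET = sum(int(np.prod(s)) for s in TARGET_SHAPES)

# Hypernetwork parameters: one linear layer with random, untrained weights.
H_W = rng.normal(0.0, 0.01, size=(N_SAMPLES, N_TARGET))
H_b = np.zeros(N_TARGET)

def hypernetwork(waveform):
    """Map a waveform to a flat vector of target-network weights."""
    return waveform @ H_W + H_b

def unpack(flat):
    """Split the flat vector into the target network's weight arrays."""
    params, i = [], 0
    for shape in TARGET_SHAPES:
        n = int(np.prod(shape))
        params.append(flat[i:i + n].reshape(shape))
        i += n
    return params

def inr(params, t):
    """Target network: evaluate the implicit representation at times t."""
    w1, b1, w2, b2 = params
    h = np.sin(t[:, None] @ w1 + b1)   # sinusoidal activation (an assumption)
    return (h @ w2 + b2)[:, 0]

# One forward pass: waveform -> INR weights -> reconstructed waveform.
t = np.linspace(-1.0, 1.0, N_SAMPLES)
waveform = np.sin(2 * np.pi * 5 * t)
params = unpack(hypernetwork(waveform))
reconstruction = inr(params, t)
mse = float(np.mean((reconstruction - waveform) ** 2))
```

In a trained system, the hypernetwork parameters (`H_W` and `H_b` here) would be optimized over many recordings to reduce a reconstruction loss such as `mse`, so that a single forward pass yields an INR for an unseen signal without per-sample training. Because the INR is a function of continuous time, `inr(params, np.linspace(-1.0, 1.0, 512))` evaluates the same representation on a finer grid.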
