SD, LG, AS · Jan 20, 2025

A2SB: Audio-to-Audio Schrodinger Bridges

arXiv:2501.11311v27 citations · h-index: 29
Originality: Incremental advance
AI Analysis

The paper addresses audio degradation in music applications, but the advance is incremental: it builds on existing restoration techniques and adds specific improvements.

The paper tackles audio restoration for high-resolution music by introducing A2SB, a model that performs bandwidth extension and inpainting at 44.1 kHz and achieves state-of-the-art quality on out-of-distribution test sets.

Real-world audio is often degraded by numerous factors. This work presents an audio restoration model tailored for high-resolution music at 44.1 kHz. Our model, Audio-to-Audio Schrödinger Bridges (A2SB), is capable of both bandwidth extension (predicting high-frequency components) and inpainting (re-generating missing segments). Critically, A2SB is end-to-end, requiring no vocoder to predict waveform outputs; it is able to restore hour-long audio inputs; and it is trained on permissively licensed music data. A2SB achieves state-of-the-art bandwidth extension and inpainting quality on several out-of-distribution music test sets.
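To make the two restoration tasks concrete, the sketch below simulates the degraded inputs each one starts from: a band-limited signal (the input to bandwidth extension) and a signal with a silenced span (the input to inpainting). It is not the A2SB model or the authors' data pipeline. The function names `band_limit` and `cut_segment`, the 4 kHz cutoff, and the test tones are assumptions chosen only for illustration.

```python
import numpy as np

SAMPLE_RATE = 44_100  # Hz, the rate named in the abstract


def band_limit(audio, cutoff_hz, sample_rate=SAMPLE_RATE):
    """Zero all content above cutoff_hz (hypothetical helper).

    The result is the kind of input a bandwidth-extension model
    would be asked to restore.
    """
    spectrum = np.fft.rfft(audio)
    freqs = np.fft.rfftfreq(len(audio), d=1.0 / sample_rate)
    spectrum[freqs > cutoff_hz] = 0.0
    return np.fft.irfft(spectrum, n=len(audio))


def cut_segment(audio, start_s, duration_s, sample_rate=SAMPLE_RATE):
    """Silence a time span (hypothetical helper).

    The result is the kind of input an inpainting model
    would be asked to fill in.
    """
    out = audio.copy()
    start = int(round(start_s * sample_rate))
    stop = start + int(round(duration_s * sample_rate))
    out[start:stop] = 0.0
    return out


# One second of a 440 Hz tone plus a quieter 12 kHz tone (illustrative signal).
t = np.arange(SAMPLE_RATE) / SAMPLE_RATE
low_tone = np.sin(2 * np.pi * 440 * t)
clean = low_tone + 0.5 * np.sin(2 * np.pi * 12_000 * t)

narrow = band_limit(clean, cutoff_hz=4_000)               # 12 kHz tone removed
gapped = cut_segment(clean, start_s=0.4, duration_s=0.2)  # 0.2 s gap at 0.4 s
```

A restoration model would take `narrow` or `gapped` as input and try to recover something close to `clean`; how A2SB does this is described in the paper itself, not here.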

