Neural-Enhanced Dynamic Range Compression Inversion: A Hybrid Approach for Restoring Audio Dynamics
This work addresses the challenge of restoring audio dynamics for applications in music production and speech processing, representing an incremental improvement over existing methods.
The paper tackled the problem of inverting Dynamic Range Compression (DRC) to restore original audio dynamics by introducing a hybrid method combining model-based inversion with neural networks for robust parameter estimation, achieving superior performance over state-of-the-art techniques on music and speech datasets.
Dynamic Range Compression (DRC) is a widely used audio effect that adjusts signal dynamics for applications in music production, broadcasting, and speech processing. Inverting DRC is of broad importance for restoring the original dynamics, enabling remixing, and enhancing the overall audio quality. Existing DRC inversion methods either overlook key parameters or rely on precise parameter values, which can be challenging to estimate accurately. To address this limitation, we introduce a hybrid approach that combines model-based DRC inversion with neural networks to achieve robust DRC parameter estimation and audio restoration simultaneously. Our method uses tailored neural network architectures (classification and regression), which are then integrated into a model-based inversion framework to reconstruct the original signal. Experimental evaluations on various music and speech datasets confirm the effectiveness and robustness of our approach, outperforming several state-of-the-art techniques.