LipsAM: Lipschitz-Continuous Amplitude Modifier for Audio Signal Processing and its Application to Plug-and-Play Dereverberation
This work addresses the robustness and stability of audio processing models for applications such as speech dereverberation, but its contribution is incremental, since it adapts existing Lipschitz-continuity concepts to one specific architecture.
The paper tackles the lack of Lipschitz-continuous deep neural networks for audio processing by proposing LipsAM, a family of Lipschitz-continuous variants of the amplitude modifier (AM) architecture. It applies them to a Plug-and-Play algorithm for speech dereverberation, where numerical experiments demonstrate improved stability.
The robustness of deep neural networks (DNNs) can be certified through their Lipschitz continuity, which has made the construction of Lipschitz-continuous DNNs an active research field. However, DNNs for audio processing have not been a major focus due to their poor compatibility with existing results. In this paper, we consider the amplitude modifier (AM), a popular architecture for handling audio signals, and propose its Lipschitz-continuous variants, which we refer to as LipsAM. We prove a sufficient condition for an AM to be Lipschitz continuous and propose two architectures as examples of LipsAM. The proposed architectures are applied to a Plug-and-Play algorithm for speech dereverberation, and their improved stability is demonstrated through numerical experiments.
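To make the notion of a Lipschitz-continuous amplitude modifier concrete, the following is a minimal NumPy sketch, not the architectures proposed in the paper (the abstract does not state their form or the sufficient condition). It assumes an AM can be modeled as a map that changes only the amplitude of each complex time-frequency bin and keeps the phase. The names `amplitude_modifier`, `soft_threshold`, `hard_threshold`, and `lipschitz_ratio`, the threshold `t=0.5`, and the array shape are all hypothetical choices for illustration. Soft-thresholding the amplitude is the proximity operator of a scaled l1 norm and is therefore 1-Lipschitz, whereas hard-thresholding is discontinuous and has no finite Lipschitz constant.

```python
import numpy as np


def amplitude_modifier(x, amp_fn):
    """Apply amp_fn to the amplitude of complex x while keeping its phase."""
    amp = np.abs(x)
    # Unit-modulus phase factor; bins with zero amplitude are given phase 1.
    safe_amp = np.where(amp > 0, amp, 1.0)
    phase = np.where(amp > 0, x / safe_amp, 1.0)
    return amp_fn(amp) * phase


def soft_threshold(amp, t=0.5):
    """Shrink amplitudes by t (illustrative 1-Lipschitz amplitude function)."""
    return np.maximum(amp - t, 0.0)


def hard_threshold(amp, t=0.5):
    """Zero out amplitudes at or below t (discontinuous amplitude function)."""
    return np.where(amp > t, amp, 0.0)


def lipschitz_ratio(f, x, y):
    """Empirical ratio ||f(x) - f(y)|| / ||x - y|| for one input pair."""
    return np.linalg.norm(f(x) - f(y)) / np.linalg.norm(x - y)


def soft_am(z):
    return amplitude_modifier(z, soft_threshold)


def hard_am(z):
    return amplitude_modifier(z, hard_threshold)


rng = np.random.default_rng(0)
shape = (64, 32)  # hypothetical (frequency bins, time frames)

# Random nearby pairs: the soft-threshold AM should never expand distances.
soft_ratios = []
for _ in range(200):
    x = rng.standard_normal(shape) + 1j * rng.standard_normal(shape)
    y = x + 0.1 * (rng.standard_normal(shape) + 1j * rng.standard_normal(shape))
    soft_ratios.append(lipschitz_ratio(soft_am, x, y))
max_soft_ratio = max(soft_ratios)

# A pair straddling the threshold exposes the jump of the hard-threshold AM.
x_edge = np.array([0.5 + 1e-6 + 0j])
y_edge = np.array([0.5 - 1e-6 + 0j])
soft_edge_ratio = lipschitz_ratio(soft_am, x_edge, y_edge)
hard_edge_ratio = lipschitz_ratio(hard_am, x_edge, y_edge)

print("max soft-threshold ratio on random pairs:", max_soft_ratio)
print("soft-threshold ratio at the threshold:", soft_edge_ratio)
print("hard-threshold ratio at the threshold:", hard_edge_ratio)
```

Sampling input pairs only gives a lower bound on the true Lipschitz constant, so a check like this can expose instability but cannot certify robustness. That gap is the reason a proved sufficient condition matters when such a network is placed inside an iterative Plug-and-Play scheme.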