Deep Active Speech Cancellation with Mamba-Masking Network
The work addresses speech cancellation under rapidly changing, high-frequency conditions, with applications such as noise reduction, but it appears incremental because it builds on existing Active Noise Cancellation (ANC) methods.
The paper tackles Active Speech Cancellation (ASC) with a deep learning network that cancels both noise and speech signals, reporting improvements of up to 7.2 dB in ANC scenarios and 6.2 dB in ASC.
We present a novel deep learning network for Active Speech Cancellation (ASC), advancing beyond Active Noise Cancellation (ANC) methods by effectively canceling both noise and speech signals. The proposed Mamba-Masking architecture introduces a masking mechanism that directly interacts with the encoded reference signal, enabling adaptive and precisely aligned anti-signal generation, even under rapidly changing, high-frequency conditions, as commonly found in speech. Complementing this, a multi-band segmentation strategy further improves phase alignment across frequency bands. Additionally, we introduce an optimization-driven loss function that provides near-optimal supervisory signals for anti-signal generation. Experimental results demonstrate substantial performance gains, achieving up to 7.2 dB improvement in ANC scenarios and 6.2 dB in ASC, significantly outperforming existing methods.
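To make the point about alignment at high frequencies concrete, the following is a minimal sketch and not the paper's method. It assumes cancellation quality is measured as the residual-to-primary energy ratio in dB (a common ANC metric; the abstract does not state which metric underlies its dB figures), and it uses a hypothetical anti-signal that is simply an inverted copy of the primary signal arriving one sample late. The names `nmse_db` and `tone`, the 16 kHz sample rate, and the test frequencies are all illustrative assumptions.

```python
import math

def nmse_db(primary, anti, eps=1e-12):
    """Residual-to-primary energy ratio in dB (lower means better cancellation)."""
    # The residual is what remains after the anti-signal is added to the primary signal.
    residual = [d + y for d, y in zip(primary, anti)]
    num = sum(e * e for e in residual)
    den = sum(d * d for d in primary)
    # eps guards against log10(0) when cancellation is perfect.
    return 10.0 * math.log10((num + eps) / (den + eps))

def tone(freq_hz, n, sample_rate):
    """Pure sine tone of n samples (illustrative test signal)."""
    return [math.sin(2 * math.pi * freq_hz * t / sample_rate) for t in range(n)]

sample_rate = 16000  # assumed sample rate, in Hz
n = 1600             # 0.1 s of audio

results = {}
for freq in (200.0, 2000.0):
    primary = tone(freq, n, sample_rate)
    # Hypothetical anti-signal: inverted primary, misaligned by one sample.
    delayed = [0.0] + primary[:-1]
    anti = [-x for x in delayed]
    results[freq] = nmse_db(primary, anti)

for freq, value in results.items():
    print(f"{freq:7.1f} Hz: residual energy ratio = {value:.1f} dB")
```

Under these assumptions, the same one-sample misalignment leaves far more residual energy at 2000 Hz than at 200 Hz, which is why precise phase alignment matters more for speech than for low-frequency noise.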