ASLGSDMar 24

Autoregressive Guidance of Deep Spatially Selective Filters using Bayesian Tracking for Efficient Extraction of Moving Speakers

arXiv:2603.2372337.6h-index: 2
Predicted impact top 86% in AS · last 90 daysOriginality Incremental advance
AI Analysis

This work addresses the challenge of efficient real-time speech enhancement in dynamic acoustic scenarios for applications like audio processing and communication systems, representing an incremental improvement over existing methods.

The paper tackled the problem of maintaining high-quality speech enhancement for moving speakers using deep spatially selective filters by developing autoregressive Bayesian tracking algorithms that incorporate enhanced signals to improve tracking accuracy, resulting in superior enhancement with negligible computational overhead.

Deep spatially selective filters achieve high-quality enhancement with real-time capable architectures for stationary speakers of known directions. To retain this level of performance in dynamic scenarios when only the speakers' initial directions are given, accurate, yet computationally lightweight tracking algorithms become necessary. Assuming a frame-wise causal processing style, temporal feedback allows for leveraging the enhanced speech signal to improve tracking performance. In this work, we investigate strategies to incorporate the enhanced signal into lightweight tracking algorithms and autoregressively guide deep spatial filters. Our proposed Bayesian tracking algorithms are compatible with arbitrary deep spatial filters. To increase the realism of simulated trajectories during development and evaluation, we propose and publish a novel dataset based on the social force model. Results validate that the autoregressive incorporation significantly improves the accuracy of our Bayesian trackers, resulting in superior enhancement with none or only negligibly increased computational overhead. Real-world recordings complement these findings and demonstrate the generalizability of our methods to unseen, challenging acoustic conditions.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes