Manan Mittal

h-index: 9

6 papers

4 citations

Novelty: 43%

AI Score: 48

Ranked #54,398 of 201,326 authors (top 27%)

#359 in SD (top 20%)

6 Papers

50.3 | SY | May 5

Adaptive Diagonal Loading for Norm Constrained Beamforming

Manan Mittal, Ryan M. Corey, John R. Buck et al.

Reliable adaptive beamforming is critical for large microphone arrays operating in highly dynamic acoustic environments. In scenarios characterized by fast-moving talkers and interferers, the available sample support for estimating the spatial correlation matrix is often snapshot-deficient. This deficiency, coupled with array imperfections, degrades the White Noise Gain (WNG), leading to severe target signal cancellation. To ensure stable and robust beamforming, we propose a novel adaptive diagonal loading method that guarantees the WNG remains strictly within specified bounds. By leveraging the Kantorovich inequality, we map the desired WNG to a strict upper bound on the condition number of the correlation matrix. Furthermore, we present three estimation techniques for the adaptive loading level, ranging from trace-based bounding to exact eigenvalue decomposition, offering scalable computational complexities of $\mathcal{O}(M)$, $\mathcal{O}(M^2)$, and $\mathcal{O}(M^3)$. Our approach demonstrates highly stable beamforming under fast-changing interference.
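The abstract does not spell out how the WNG target becomes a condition-number bound, so the following is only a sketch of one way such a mapping can work, not necessarily the authors' derivation. It assumes MVDR weights $w = R^{-1}d / (d^H R^{-1} d)$ and a steering vector with $\|d\|^2 = M$. A standard form of the Kantorovich inequality then gives $\mathrm{WNG} \ge M \cdot 4\kappa/(\kappa+1)^2$, where $\kappa$ is the condition number of the loaded matrix. Requiring $\mathrm{WNG} \ge \gamma M$ yields $\kappa \le (2 - \gamma + 2\sqrt{1-\gamma})/\gamma$, and the smallest loading $\delta$ that meets this bound follows from the extreme eigenvalues. The names `gamma`, `max_condition_number`, `loading_level`, and `wng_diagonal` are hypothetical.

```python
import math

def max_condition_number(gamma):
    """Largest kappa satisfying 4*kappa/(kappa+1)**2 >= gamma, for 0 < gamma <= 1.

    gamma is the required WNG as a fraction of its maximum M (an assumption
    of this sketch, not notation taken from the paper).
    """
    if not 0.0 < gamma <= 1.0:
        raise ValueError("gamma must lie in (0, 1]")
    return (2.0 - gamma + 2.0 * math.sqrt(1.0 - gamma)) / gamma

def loading_level(lam_min, lam_max, kappa_max):
    """Smallest delta >= 0 with (lam_max + delta) / (lam_min + delta) <= kappa_max."""
    if kappa_max <= 1.0:
        raise ValueError("kappa_max must exceed 1 for a finite loading level")
    return max(0.0, (lam_max - kappa_max * lam_min) / (kappa_max - 1.0))

def wng_diagonal(eigs, delta):
    """WNG of the MVDR beamformer for the toy case R = diag(eigs) + delta*I
    with an all-ones steering vector: (d^H R^-1 d)^2 / (d^H R^-2 d)."""
    inv = [1.0 / (lam + delta) for lam in eigs]
    return sum(inv) ** 2 / sum(x * x for x in inv)

# Toy usage: an ill-conditioned 4-element "correlation matrix".
eigs = [1e-3, 0.5, 2.0, 50.0]
gamma = 0.5                                  # ask for WNG >= 0.5 * M
kappa = max_condition_number(gamma)
delta = loading_level(min(eigs), max(eigs), kappa)
print(delta, wng_diagonal(eigs, delta))
```

The bound is a worst case over all steering vectors, so the WNG actually achieved for a particular $d$ is usually above $\gamma M$; a smaller loading level might suffice if the steering vector were taken into account.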

15.4 | SP | May 24

Time Segmented Beamforming via Dynamic Programming: Theory and Implementation

Manan Mittal, Ryan M. Corey, Diego Cuji et al.

In dynamic acoustic environments with time-varying interferers, effective beamforming requires identifying stationary regions over time. The Capon beamformer, a whitened matched filter constrained to maintain unity gain in the desired direction, theoretically relies on the instantaneous ensemble covariance matrix. Practical implementations rely on the batch Capon (or Sample Matrix Inversion), which estimates the sample covariance matrix (SCM) by averaging over a block of snapshots. This practical approach implicitly assumes that the data within the batch window is stationary and can be coherently combined. In non-stationary settings, a batch approach that averages over fixed or excessively long windows fails, as moving interferers smear the SCM and degrade the beamformer's nulling capabilities. To address this, this paper introduces a temporally segmented distortionless response beamformer. Inspired by the segmented least squares method, which fits piecewise polynomials to data while penalizing excessive segmentation to prevent overfitting, the framework extends practical Capon beamforming by incorporating data-driven temporal segmentation. This formulation minimizes output power while dynamically adapting the SCM estimation windows to local stationarity, offering a principled approach to tracking time-varying interferers.
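The segmented least squares idea that the abstract cites can be sketched with the classic dynamic program $\mathrm{OPT}(j) = \min_{i \le j} \{\mathrm{OPT}(i-1) + \mathrm{cost}(i, j) + C\}$, where $C$ penalizes each new segment. The example below is a scalar stand-in, not the paper's beamformer: the segment cost is the squared error of a piecewise-constant fit, whereas the paper's formulation would presumably use the beamformer output power under each segment's SCM. The function name `segment` and the parameter `penalty` are hypothetical.

```python
import math

def segment(data, penalty):
    """Split data into contiguous segments minimizing
    (sum of per-segment squared error around the segment mean)
    + penalty * (number of segments).

    Returns (total_cost, segments), where segments is a list of
    (start, end) index pairs, both inclusive.
    """
    n = len(data)
    # Prefix sums give the squared error of any segment in O(1).
    s1 = [0.0] * (n + 1)
    s2 = [0.0] * (n + 1)
    for i, x in enumerate(data):
        s1[i + 1] = s1[i] + x
        s2[i + 1] = s2[i] + x * x

    def sse(i, j):
        """Squared error of the best constant fit to data[i:j] (j exclusive)."""
        total = s1[j] - s1[i]
        return (s2[j] - s2[i]) - total * total / (j - i)

    best = [0.0] + [math.inf] * n   # best[j]: optimal cost for data[:j]
    back = [0] * (n + 1)            # back[j]: start index of the last segment
    for j in range(1, n + 1):
        for i in range(j):
            cost = best[i] + sse(i, j) + penalty
            if cost < best[j]:
                best[j], back[j] = cost, i

    # Walk the back-pointers to recover the segment boundaries.
    segments = []
    j = n
    while j > 0:
        i = back[j]
        segments.append((i, j - 1))
        j = i
    segments.reverse()
    return best[n], segments

# Toy usage: one abrupt change halfway through the record.
print(segment([0, 0, 0, 0, 5, 5, 5, 5], penalty=1.0))
```

A larger `penalty` merges segments and a smaller one splits them, which is the overfitting trade-off the abstract describes. This version costs $\mathcal{O}(n^2)$ evaluations of the segment cost.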

SD | Sep 25, 2025

Mixture-of-Experts Framework for Field-of-View Enhanced Signal-Dependent Binauralization of Moving Talkers

Manan Mittal, Thomas Deppisch, Joseph Forrer et al.

We propose a novel mixture-of-experts framework for field-of-view enhancement in binaural signal matching. Our approach enables dynamic spatial audio rendering that adapts to continuous talker motion, allowing users to emphasize or suppress sounds from selected directions while preserving natural binaural cues. Unlike traditional methods that rely on explicit direction-of-arrival estimation or operate in the Ambisonics domain, our signal-dependent framework combines multiple binaural filters in an online manner using implicit localization. This allows for real-time tracking and enhancement of moving sound sources, supporting applications such as speech focus, noise reduction, and world-locked audio in augmented and virtual reality. The method is agnostic to array geometry, offering a flexible solution for spatial audio capture and personalized playback in next-generation consumer audio devices.

70.5 | SP | May 11

Adaptive Diagonal Loading using Krylov Subspaces for Robust Beamforming

Manan Mittal, Ryan M. Corey, John R. Buck et al.

Reliable adaptive beamforming is critical for large microphone arrays operating in highly dynamic acoustic environments. In scenarios characterized by fast-moving talkers and interferers, the available sample support for estimating the spatial correlation matrix is often snapshot-deficient. This deficiency degrades the White Noise Gain (WNG), leading to severe target signal cancellation. To ensure stable and robust beamforming, we previously proposed an adaptive diagonal loading method that leverages the Kantorovich inequality to guarantee the WNG remains strictly within specified bounds. However, accurately determining the smallest necessary loading level requires calculating the extreme eigenvalues of the spatial correlation matrix, a computationally expensive $\mathcal{O}(M^3)$ operation for large arrays. In this paper, we introduce a highly efficient $\mathcal{O}(kM^2)$ estimation technique using Lanczos iterations to build a small Krylov subspace. By projecting the correlation matrix onto a tridiagonal matrix of dimension $k \ll M$, we extract Ritz values that rapidly converge to the exact extreme eigenvalues. Our evaluations demonstrate that this Lanczos-accelerated approach achieves performance identical to exact Eigenvalue Decomposition (EVD), ensuring optimal interference suppression and strict WNG adherence at a fraction of the computational cost.
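The Lanczos step described above can be illustrated with a small, dependency-free sketch. It is a generic textbook version under stated assumptions, not the authors' implementation: the matrix is real symmetric (a spatial correlation matrix would generally be complex Hermitian), full reorthogonalization is used for numerical safety, and the extreme eigenvalues of the small tridiagonal matrix are found by Sturm-sequence bisection. All function names are hypothetical.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def lanczos(A, k, v0):
    """Run up to k Lanczos steps on the symmetric matrix A (list of rows).
    Returns (alphas, betas): the diagonal and off-diagonal of the tridiagonal T."""
    norm = math.sqrt(dot(v0, v0))
    basis, alphas, betas = [[x / norm for x in v0]], [], []
    for _ in range(k):
        q = basis[-1]
        w = [dot(row, q) for row in A]          # the O(M^2) matrix-vector product
        alphas.append(dot(w, q))
        for p in basis:                          # full reorthogonalization
            c = dot(w, p)
            w = [x - c * y for x, y in zip(w, p)]
        beta = math.sqrt(dot(w, w))
        if len(alphas) == k or beta < 1e-12:     # done, or invariant subspace found
            break
        betas.append(beta)
        basis.append([x / beta for x in w])
    return alphas, betas

def count_below(alphas, betas, x):
    """Number of eigenvalues of T strictly below x (Sturm sequence / LDL^T pivots)."""
    count, d = 0, 1.0
    for i, a in enumerate(alphas):
        b2 = betas[i - 1] ** 2 if i > 0 else 0.0
        d = a - x - b2 / d
        if d == 0.0:
            d = 1e-30                            # nudge an exactly zero pivot
        if d < 0.0:
            count += 1
    return count

def tridiag_eig(alphas, betas, j, iters=100):
    """j-th smallest eigenvalue of T (0-indexed), by bisection on Gershgorin bounds."""
    n = len(alphas)
    radius = [(betas[i - 1] if i > 0 else 0.0) + (betas[i] if i < n - 1 else 0.0)
              for i in range(n)]
    lo = min(a - r for a, r in zip(alphas, radius))
    hi = max(a + r for a, r in zip(alphas, radius))
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if count_below(alphas, betas, mid) > j:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

def extreme_ritz(A, k, v0):
    """Smallest and largest Ritz values after k Lanczos steps."""
    alphas, betas = lanczos(A, k, v0)
    return tridiag_eig(alphas, betas, 0), tridiag_eig(alphas, betas, len(alphas) - 1)
```

Ritz values always lie inside the spectrum of the matrix, so the estimated $\lambda_{\max}$ is never too large and the estimated $\lambda_{\min}$ is never too small. A loading level computed from them can therefore err slightly on the low side until the iteration has converged.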

9.7 | SD | May 8

Online Segmented Beamforming via Dynamic Programming

Manan Mittal, Ryan M. Corey, Diego Cuji et al.

In dynamic acoustic environments characterized by time-varying interferers and moving sources, effective beamforming requires accurately identifying stationary regions over time. Traditional Capon beamformers rely on the instantaneous ensemble covariance matrix, which is inaccessible in practice. Practical implementations overcome this by estimating the sample covariance matrix (SCM) through averaging over a block of temporal samples. However, in non-stationary settings, a naive batch approach fails. Moving interferers smear the SCM, causing the beamformer to place nulls in outdated locations while failing to track newly active interferers, thereby degrading its nulling capabilities. To address this fundamental limitation, an Online Segmented Beamformer is proposed. This algorithm incorporates data-driven temporal segmentation to causally minimize output power while dynamically adapting the SCM estimation windows to local stationarity. By framing the problem through the lens of dynamic programming, the proposed method tracks abrupt environmental changes and resets covariance estimates in real time. We validate the performance of this framework in a complex, reverberant simulated acoustic environment and in highly reverberant real-world experiments, demonstrating its superiority over fixed-window adaptive methods.

LG | Jul 5, 2025

Latent FxLMS: Accelerating Active Noise Control with Neural Adaptive Filters

Kanad Sarkar, Austin Lu, Manan Mittal et al.

Filtered-X LMS (FxLMS) is commonly used for active noise control (ANC), wherein the soundfield is minimized at a desired location. Given prior knowledge of the spatial region of the noise or control sources, we could improve FxLMS by adapting along the low-dimensional manifold of possible adaptive filter weights. We train an auto-encoder on the filter coefficients of the steady-state adaptive filter for each primary source location sampled from a given spatial region and constrain the weights of the adaptive filter to be the output of the decoder for a given state of latent variables. Then, we perform updates in the latent space and use the decoder to generate the cancellation filter. We evaluate how various neural network constraints and normalization techniques impact the convergence speed and steady-state mean squared error. Under certain conditions, our Latent FxLMS model converges in fewer steps than standard FxLMS, with comparable steady-state error.