Stephen S. -T. Yau

AI
h-index: 7
4 papers
15 citations
Novelty: 53%
AI Score: 46

4 Papers

NA, May 25
Tensor train methods for high-dimensional nonlinear filtering problems with correlated noise

Yuhua Meng, Stephen S. -T. Yau, Zhiwen Zhang

Nonlinear filtering with correlated noise leads to a Duncan-Mortensen-Zakai (DMZ) equation in the form of a stochastic partial differential equation (SPDE). Unlike the independent noise case, the presence of correlation prevents the classical invertible transformation that reduces the DMZ equation to a deterministic partial differential equation, requiring a direct numerical treatment of the SPDE. This paper develops a tensor train (TT) based framework for solving medium- to high-dimensional DMZ equations with correlated noise. Spatial discretization transforms the SPDE into a high-dimensional stochastic differential system, which is efficiently compressed using TT approximation. A semi-implicit Milstein scheme is employed for temporal integration to ensure stability and accuracy. Under suitable regularity assumptions, we establish a convergence analysis of the proposed method. In particular, the spatial error is controlled by both the mesh size and the prescribed TT approximation accuracy. In the temporal direction, the convergence is proved by estimating stochastic integrals involving drifted observations, without invoking a change-of-measure argument. Numerical experiments demonstrate that the proposed method achieves stable and accurate performance for cubic sensor problems. In challenging multi-modal settings, where the particle filter and the extended Kalman filter deteriorate, the proposed method maintains accuracy and effectively captures the posterior distribution.
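To make the time-stepping idea concrete, here is a minimal sketch of a drift-implicit Milstein step on a scalar test equation, dX = a X dt + b X dW. This is not the paper's DMZ solver: the test equation, the function name semi_implicit_milstein, and all parameter values are assumptions chosen for illustration. The drift is treated implicitly, while the diffusion term and the Milstein correction stay explicit.

```python
import math
import random

def semi_implicit_milstein(x0, a, b, T, n, rng):
    """Integrate dX = a*X dt + b*X dW on [0, T] with n uniform steps.

    Hypothetical helper for illustration only. Returns the final state
    and the terminal Brownian value W_T of the sampled path.
    """
    dt = T / n
    x, w = x0, 0.0
    for _ in range(n):
        dw = rng.gauss(0.0, math.sqrt(dt))
        # Explicit part: diffusion term plus the Milstein correction.
        explicit = x + b * x * dw + 0.5 * b * b * x * (dw * dw - dt)
        # Implicit part: solve x_new = explicit + a * x_new * dt for x_new.
        x = explicit / (1.0 - a * dt)
        w += dw
    return x, w

rng = random.Random(0)
x_num, w_T = semi_implicit_milstein(1.0, -1.0, 0.5, 1.0, 1000, rng)
# Exact solution of the test equation along the same Brownian path.
x_exact = math.exp((-1.0 - 0.5 * 0.5 ** 2) * 1.0 + 0.5 * w_T)
print(abs(x_num - x_exact))
```

For this linear test equation the exact solution is known in closed form, so the pathwise error can be checked directly. For a < 0 the divisor 1 - a*dt stays above 1 at any step size, so the implicit drift never amplifies the state.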

SY, Mar 15
Nonlinear Bayesian Filtering with Natural Gradient Gaussian Approximation

Wenhan Cao, Tianyi Zhang, Zeju Sun et al.

Practical Bayes filters often assume the state distribution of each time step to be Gaussian for computational tractability, resulting in the so-called Gaussian filters. When facing nonlinear systems, Gaussian filters such as the extended Kalman filter (EKF) or the unscented Kalman filter (UKF) typically rely on certain linearization techniques, which can introduce large estimation errors. To address this issue, this paper reconstructs the prediction and update steps of Gaussian filtering as solutions to two distinct optimization problems, whose optimality conditions are found to have analytical forms from Stein's lemma. It is observed that the stationary point for the prediction step requires calculating the first two moments of the prior distribution, which is equivalent to that step in existing moment-matching filters. In the update step, instead of linearizing the model to approximate the stationary points, we propose an iterative approach to directly minimize the update step's objective to avoid linearization errors. For the purpose of performing the steepest descent on the Gaussian manifold, we derive its natural gradient that leverages the Fisher information matrix to adjust the gradient direction, accounting for the curvature of the parameter space. Combining this update step with moment matching in the prediction step, we introduce a new iterative filter for nonlinear systems called the Natural Gradient Gaussian Approximation filter, or NANO filter for short. We prove that the NANO filter locally converges to the optimal Gaussian approximation at each time step. Furthermore, the estimation error is proven to be exponentially bounded for nearly linear measurement equations and low noise levels by constructing a supermartingale-like property across consecutive time steps.
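As a rough illustration of a natural-gradient update on a Gaussian, the sketch below applies the generic variational-Gaussian natural-gradient iteration to a scalar measurement update. It is not taken from the paper and should not be read as the NANO algorithm. The function names, the fixed 3-point Gauss-Hermite rule, and the step-size parameter are all assumptions. The caller supplies the first and second derivatives of the negative log-likelihood, dl and d2l.

```python
import math

# 3-point Gauss-Hermite rule for a standard normal variable
# (exact for polynomial integrands up to degree 5).
_NODES = (-math.sqrt(3.0), 0.0, math.sqrt(3.0))
_WEIGHTS = (1.0 / 6.0, 2.0 / 3.0, 1.0 / 6.0)

def gauss_expect(f, mean, var):
    """Approximate E[f(x)] for x ~ N(mean, var)."""
    sd = math.sqrt(var)
    return sum(w * f(mean + sd * z) for z, w in zip(_NODES, _WEIGHTS))

def natural_gradient_update(m0, p0, dl, d2l, step=1.0, iters=20):
    """Scalar Gaussian update by natural-gradient iterations (illustrative).

    m0, p0: prior mean and variance.
    dl, d2l: first and second derivatives of the negative log-likelihood.
    Assumes the precision stays positive, e.g. a log-concave likelihood.
    """
    s0 = 1.0 / p0          # prior precision
    m, s = m0, s0
    for _ in range(iters):
        g = gauss_expect(dl, m, 1.0 / s)    # expected gradient
        h = gauss_expect(d2l, m, 1.0 / s)   # expected curvature
        s_new = (1.0 - step) * s + step * (s0 + h)
        m = m - step * (g + s0 * (m - m0)) / s_new
        s = s_new
    return m, 1.0 / s

# Linear measurement y = c*x + noise with variance r: the fixed point
# coincides with the scalar Kalman update.
c, r, y = 0.5, 0.1, 1.2
m, p = natural_gradient_update(
    1.0, 2.0,
    dl=lambda x: -c * (y - c * x) / r,
    d2l=lambda x: c * c / r,
)
print(m, p)
```

With a linear measurement and a unit step, the iteration reaches the Kalman posterior after one pass and then stays there. For a nonlinear measurement the quadrature is only approximate, and a smaller step may be needed to keep the precision positive.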

AI, May 1
AEM: Adaptive Entropy Modulation for Multi-Turn Agentic Reinforcement Learning

Haotian Zhao, Yuxin Zhang, Songlin Zhou et al.

Reinforcement learning (RL) has significantly advanced the ability of large language model (LLM) agents to interact with environments and solve multi-turn tasks. Yet effective training remains challenging, as sparse, outcome-only rewards make it difficult to assign credit to individual steps in an agent's action trajectory. A common remedy is to introduce dense intermediate supervision, such as process reward models or auxiliary self-supervised signals, but this increases supervision and tuning complexity and often generalizes poorly across tasks and domains. This paper presents AEM, a supervision-free credit assignment method that adaptively modulates entropy dynamics during RL training to achieve a more effective exploration-exploitation trade-off. Theoretically, we elevate entropy analysis from the token level to the response level to reduce token sampling variance and show that entropy drift under natural gradients is intrinsically governed by the product of the advantage and the relative response surprisal. Specifically, we derive a practical proxy to reshape training dynamics, enabling a natural transition from exploration to exploitation. Extensive experiments across various benchmarks and models ranging from 1.5B to 32B parameters demonstrate the effectiveness of AEM, including a notable 1.4 percent gain when integrated into a state-of-the-art baseline on the highly challenging SWE-bench-Verified benchmark.
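The statement about entropy drift can be checked on a toy case. The sketch below uses a tabular softmax policy over a handful of candidate responses, with a natural-gradient step that shifts each logit by the learning rate times its advantage. To first order, the entropy change is then the learning rate times the policy-weighted average of advantage times (surprisal minus entropy). Reading "relative response surprisal" as surprisal minus entropy is an assumption, as are the tabular setting, the function names, and the numbers. This is not the AEM method itself.

```python
import math

def softmax(logits):
    top = max(logits)
    exps = [math.exp(v - top) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def entropy(probs):
    return -sum(p * math.log(p) for p in probs)

def predicted_entropy_drift(logits, advantages, lr):
    """First-order prediction: lr * E_pi[A * (surprisal - entropy)]."""
    probs = softmax(logits)
    h = entropy(probs)
    return lr * sum(p * a * (-math.log(p) - h)
                    for p, a in zip(probs, advantages))

def actual_entropy_drift(logits, advantages, lr):
    """Entropy change after the toy natural-gradient step on the logits."""
    stepped = [t + lr * a for t, a in zip(logits, advantages)]
    return entropy(softmax(stepped)) - entropy(softmax(logits))

logits = [0.0, 1.0, -0.5, 2.0]       # hypothetical response logits
advantages = [1.0, -0.5, 0.3, -0.8]  # hypothetical advantages
print(predicted_entropy_drift(logits, advantages, 1e-3))
print(actual_entropy_drift(logits, advantages, 1e-3))
```

In this example the most likely response has a negative advantage, so the step moves probability toward less likely responses and the entropy rises. Rewarding an already dominant response would push entropy down instead.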

ML, Mar 30, 2024
Convolutional Bayesian Filtering

Wenhan Cao, Shiqi Liu, Chang Liu et al.

Bayesian filtering serves as the mainstream framework of state estimation in dynamic systems. Its standard version applies the total probability rule and Bayes' law alternately, where how to define and compute conditional probability is critical to state distribution inference. Previously, the conditional probability has been assumed to be exactly known, which represents a measure of the occurrence probability of one event, given the second event. In this paper, we find that by adding an additional event that stipulates an inequality condition, we can transform the conditional probability into a special integration that is analogous to convolution. Based on this transformation, we show that both transition probability and output probability can be generalized to convolutional forms, resulting in a more general filtering framework that we call convolutional Bayesian filtering. This new framework encompasses standard Bayesian filtering as a special case when the distance metric of the inequality condition is selected as the Dirac delta function. It also allows for a more nuanced consideration of model mismatch by choosing different types of inequality conditions. For instance, when the distance metric is defined in a distributional sense, the transition probability and output probability can be approximated by simply rescaling them into fractional powers. Under this framework, a robust version of the Kalman filter can be constructed by only altering the noise covariance matrix, while maintaining the conjugate nature of Gaussian distributions. Finally, we exemplify the effectiveness of our approach by reshaping classic filtering algorithms into convolutional versions, including the Kalman filter, extended Kalman filter, unscented Kalman filter, and particle filter.
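The remark that a robust Kalman filter needs only a changed noise covariance follows from a simple fact. Raising a Gaussian likelihood with variance r to a power alpha in (0, 1] gives, up to normalization, a Gaussian with variance r / alpha. The scalar sketch below illustrates that for the measurement update only. The function name, the symbol alpha, and the numbers are assumptions, and the abstract does not say how the exponent is chosen.

```python
def fractional_kalman_update(mean, var, y, h, r, alpha=1.0):
    """Scalar Kalman measurement update with the likelihood raised to alpha.

    Hypothetical helper for illustration. The fractional power only
    inflates the measurement noise variance from r to r / alpha, so the
    posterior stays Gaussian.
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must lie in (0, 1]")
    r_eff = r / alpha                        # inflated measurement noise
    gain = var * h / (h * h * var + r_eff)   # Kalman gain
    new_mean = mean + gain * (y - h * mean)
    new_var = (1.0 - gain * h) * var
    return new_mean, new_var

# alpha = 1 is the standard update; alpha < 1 trusts the measurement less.
print(fractional_kalman_update(1.0, 2.0, 1.2, 0.5, 0.1, alpha=1.0))
print(fractional_kalman_update(1.0, 2.0, 1.2, 0.5, 0.1, alpha=0.5))
```

A smaller alpha lowers the gain, so the posterior mean moves less toward the measurement and the posterior variance stays larger. That is the usual hedge against a mismatched measurement model.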