LGNov 4, 2025

Reinforcement learning based data assimilation for unknown state model

arXiv:2511.02286v11 citationsh-index: 1

Originality Highly original

AI Analysis

This addresses a critical challenge in state estimation for applications where governing equations are unknown, offering a novel approach that bypasses the need for pre-computed training datasets.

The paper tackles the problem of data assimilation for unknown state dynamics without requiring noise-free ground-truth state sequences, proposing a reinforcement learning method that learns a surrogate transition model directly from noisy observations and achieves superior accuracy and robustness in high-dimensional settings.

Data assimilation (DA) has increasingly emerged as a critical tool for state estimation across a wide range of applications. It is signiffcantly challenging when the governing equations of the underlying dynamics are unknown. To this end, various machine learning approaches have been employed to construct a surrogate state transition model in a supervised learning framework, which relies on pre-computed training datasets. However, it is often infeasible to obtain noise-free ground-truth state sequences in practice. To address this challenge, we propose a novel method that integrates reinforcement learning with ensemble-based Bayesian ffltering methods, enabling the learning of surrogate state transition model for unknown dynamics directly from noisy observations, without using true state trajectories. Speciffcally, we treat the process for computing maximum likelihood estimation of surrogate model parameters as a sequential decision-making problem, which can be formulated as a discretetime Markov decision process (MDP). Under this formulation, learning the surrogate transition model is equivalent to ffnding an optimal policy of the MDP, which can be effectively addressed using reinforcement learning techniques. Once the model is trained offfine, state estimation can be performed in the online stage using ffltering methods based on the learned dynamics. The proposed framework accommodates a wide range of observation scenarios, including nonlinear and partially observed measurement models. A few numerical examples demonstrate that the proposed method achieves superior accuracy and robustness in high-dimensional settings.

View on arXiv PDF

Similar