Machine Learning Techniques to Construct Patched Analog Ensembles for Data Assimilation
This work addresses scalability issues in data assimilation for complex dynamical models, offering an incremental improvement over existing methods.
The paper tackles the scalability of constructed analog ensemble optimal interpolation (cAnEnOI) for data assimilation by proposing a patching scheme that divides the spatial domain into chunks, which makes training the generative models feasible and allows the generative step to run in parallel. Tests on a 1D toy model show that the patched method outperforms both the original cAnEnOI and the ensemble square root filter, with a trade-off between patch size, generative-model training accuracy, and assimilation performance.
The use of generative models from the machine learning literature to create artificial ensemble members within data assimilation schemes was introduced in [Grooms QJRMS, 2020] as constructed analog ensemble optimal interpolation (cAnEnOI). Specifically, we study general and variational autoencoders for the machine learning component of this method, and combine the ideas of constructed analogs and ensemble optimal interpolation in the data assimilation component. To extend the scalability of cAnEnOI for use in data assimilation on complex dynamical models, we propose using patching schemes to divide the global spatial domain into digestible chunks. Using patches makes training the generative models possible and has the added benefit of allowing the generative step to exploit parallelism. Testing this new algorithm on a 1D toy model, we find that larger patch sizes make it harder to train an accurate generative model (i.e., a model whose reconstruction error is small), while the data assimilation performance improves at larger patch sizes. There is thus a sweet spot where the patch size is large enough to enable good data assimilation performance, but not so large that it becomes difficult to train an accurate generative model. In our tests the new patched cAnEnOI method outperforms the original (unpatched) cAnEnOI, as well as the ensemble square root filter results from [Grooms QJRMS, 2020].
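To make the patching idea concrete, the following is a minimal sketch in Python with NumPy, not the paper's implementation. It assumes non-overlapping patches whose size divides the length of a 1D domain; the abstract does not specify whether patches overlap or how they are blended at their edges. The names `split_into_patches`, `reassemble`, `generate_patched_member`, and `perturb_patch` are hypothetical, and `perturb_patch` is only a stand-in for a trained autoencoder that would encode a patch, perturb its latent code, and decode it.

```python
import numpy as np


def split_into_patches(field, patch_size):
    """Split a 1D field into non-overlapping patches (assumed layout)."""
    n = field.shape[0]
    if n % patch_size != 0:
        raise ValueError("patch_size must divide the domain length")
    # Row i holds grid points i*patch_size .. (i+1)*patch_size - 1.
    return field.reshape(-1, patch_size)


def reassemble(patches):
    """Concatenate patches back into a single global 1D field."""
    return patches.reshape(-1)


def generate_patched_member(field, patch_size, generate_patch):
    """Build one artificial ensemble member patch by patch.

    Each call to generate_patch is independent of the others, which is
    what would allow the generative step to be distributed across workers.
    """
    patches = split_into_patches(field, patch_size)
    new_patches = np.stack([generate_patch(p) for p in patches])
    return reassemble(new_patches)


rng = np.random.default_rng(0)


def perturb_patch(patch):
    # Hypothetical stand-in for the generative model: additive noise only.
    return patch + 0.1 * rng.standard_normal(patch.shape)


# Toy 1D state on 40 grid points, split into 5 patches of 8 points each.
state = np.sin(2 * np.pi * np.arange(40) / 40)
member = generate_patched_member(state, patch_size=8, generate_patch=perturb_patch)
```

In this layout a smaller `patch_size` gives each generative model a lower-dimensional input, which is the regime the abstract reports as easier to train, while a larger `patch_size` moves toward the regime it reports as better for assimilation performance.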