Interpreting CFD Surrogates through Sparse Autoencoders
This work addresses the need for explainability and trustworthiness in CFD applications, particularly in safety-critical or regulation-bound settings. Its contribution is incremental, since it builds on existing surrogate models and interpretability methods.
The authors address the problem of opaque latent representations in learning-based CFD surrogate models by introducing a posthoc interpretability framework based on sparse autoencoders, which extracts interpretable latent features aligned with physical phenomena such as vorticity.
Learning-based surrogate models have become a practical alternative to high-fidelity CFD solvers, but their latent representations remain opaque and hinder adoption in safety-critical or regulation-bound settings. This work introduces a posthoc interpretability framework for graph-based surrogate models used in computational fluid dynamics (CFD) by leveraging sparse autoencoders (SAEs). By obtaining an overcomplete basis in the node embedding space of a pretrained surrogate, the method extracts a dictionary of interpretable latent features. The approach enables the identification of monosemantic concepts aligned with physical phenomena such as vorticity or flow structures, offering a model-agnostic pathway to enhance explainability and trustworthiness in CFD applications.
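To make the idea of an overcomplete dictionary over node embeddings concrete, the following is a minimal sketch rather than the authors' implementation. The abstract does not specify the SAE architecture or training objective, so the sketch assumes one common formulation: a ReLU encoder, a linear decoder, and an L1 sparsity penalty on the activations. All names (`init_sae`, `encode`, `decode`, `sae_loss`), the dimensions, and the penalty weight are hypothetical, and random vectors stand in for the node embeddings of a pretrained surrogate.

```python
import numpy as np


def init_sae(d_embed, d_dict, seed=0):
    """Initialise a sparse autoencoder with d_dict > d_embed (overcomplete)."""
    rng = np.random.default_rng(seed)
    # Each row of W_dec is one dictionary feature, normalised to unit length.
    W_dec = rng.normal(size=(d_dict, d_embed))
    W_dec /= np.linalg.norm(W_dec, axis=1, keepdims=True)
    return {
        "W_enc": W_dec.T.copy(),      # (d_embed, d_dict), tied at initialisation only
        "b_enc": np.zeros(d_dict),
        "W_dec": W_dec,               # (d_dict, d_embed)
        "b_dec": np.zeros(d_embed),
    }


def encode(params, h):
    """Map node embeddings h of shape (N, d_embed) to non-negative codes (N, d_dict)."""
    pre = (h - params["b_dec"]) @ params["W_enc"] + params["b_enc"]
    return np.maximum(0.0, pre)       # ReLU keeps activations non-negative


def decode(params, z):
    """Reconstruct embeddings as a linear combination of dictionary features."""
    return z @ params["W_dec"] + params["b_dec"]


def sae_loss(params, h, l1_coeff=1e-3):
    """Reconstruction error plus an L1 penalty that encourages sparse codes."""
    z = encode(params, h)
    h_hat = decode(params, z)
    recon = np.mean(np.sum((h - h_hat) ** 2, axis=1))
    sparsity = np.mean(np.sum(np.abs(z), axis=1))
    return recon + l1_coeff * sparsity


# Stand-in data: 100 mesh nodes with 16-dimensional embeddings (assumed sizes).
node_embeddings = np.random.default_rng(1).normal(size=(100, 16))
params = init_sae(d_embed=16, d_dict=64)   # 4x overcomplete dictionary
codes = encode(params, node_embeddings)
print(codes.shape, sae_loss(params, node_embeddings))
```

In such a setup, each column of `codes` is one candidate latent feature. After training, its activation over the mesh nodes could be compared against physical fields such as vorticity to check whether the feature corresponds to a single concept.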