MicroEvoEval: A Systematic Evaluation Framework for Image-Based Microstructure Evolution Prediction
This work addresses a gap in materials science by providing a systematic benchmark for evaluating deep learning models. The contribution is incremental: it builds on existing methods but standardizes and extends the evaluation criteria.
The paper tackles the lack of standardized benchmarks for deep learning models that predict microstructure evolution in materials science by introducing MicroEvoEval, a comprehensive evaluation framework. It finds that modern architectures such as VMamba achieve superior long-term stability and physical fidelity, along with an order-of-magnitude greater computational efficiency, compared to other models.
Simulating microstructure evolution (MicroEvo) is vital for materials design but demands high numerical accuracy, efficiency, and physical fidelity. Although recent studies on deep learning (DL) offer a promising alternative to traditional solvers, the field lacks standardized benchmarks. Existing studies have three shortcomings: they do not compare specialized MicroEvo DL models with state-of-the-art spatio-temporal architectures, they emphasize numerical accuracy over physical fidelity, and they do not analyze error propagation over time. To address these gaps, we introduce MicroEvoEval, the first comprehensive benchmark for image-based microstructure evolution prediction. We evaluate 14 models, encompassing both domain-specific and general-purpose architectures, across four representative MicroEvo tasks with datasets specifically structured for both short- and long-term assessment. Our multi-faceted evaluation framework goes beyond numerical accuracy and computational cost, incorporating a curated set of structure-preserving metrics to assess physical fidelity. Our extensive evaluations yield several key insights. Notably, we find that modern architectures (e.g., VMamba) not only achieve superior long-term stability and physical fidelity but also operate with an order-of-magnitude greater computational efficiency. The results highlight the necessity of holistic evaluation and identify these modern architectures as a highly promising direction for developing efficient and reliable surrogate models in data-driven materials science.
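To make the notion of error propagation over time concrete, the following is a minimal sketch of an autoregressive rollout evaluation, in which a model is fed its own predictions and both a numerical error and a simple structure-related quantity are tracked at every step. It is an illustration under stated assumptions, not the benchmark's implementation: the names `rollout_errors`, `mse`, and `mean_value` are hypothetical, and the mean field value is used only as a stand-in for a conserved quantity such as phase fraction, since the abstract does not list the actual structure-preserving metrics.

```python
from typing import Callable, Dict, List

Frame = List[List[float]]  # one 2D microstructure image as nested lists


def mse(a: Frame, b: Frame) -> float:
    """Mean squared error between two equally sized frames."""
    n = sum(len(row) for row in a)
    return sum((x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb)) / n


def mean_value(frame: Frame) -> float:
    """Average field value (assumed stand-in for a conserved quantity)."""
    n = sum(len(row) for row in frame)
    return sum(x for row in frame for x in row) / n


def rollout_errors(step: Callable[[Frame], Frame],
                   truth: List[Frame]) -> List[Dict[str, float]]:
    """Roll the model forward from truth[0] and record per-step errors."""
    pred = truth[0]
    report = []
    for t in range(1, len(truth)):
        pred = step(pred)  # autoregressive: the input is the previous prediction
        report.append({
            "step": t,
            "mse": mse(pred, truth[t]),
            "mean_drift": abs(mean_value(pred) - mean_value(truth[t])),
        })
    return report


if __name__ == "__main__":
    # Toy data: a static field, and a hypothetical model with a small bias per step.
    truth = [[[0.5, 0.5], [0.5, 0.5]] for _ in range(4)]
    biased_model = lambda f: [[x + 0.1 for x in row] for row in f]
    for row in rollout_errors(biased_model, truth):
        print(row)
```

In this toy case a per-step bias that looks negligible after one step grows with every further step, which is the kind of behavior a short-term-only evaluation would miss.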