CV · May 19, 2025 · Code
AutoMat: Enabling Automated Crystal Structure Reconstruction from Microscopy via Agentic Tool Use
Yaotian Yang, Yiwen Tang, Yizhe Chen et al.
Machine learning-based interatomic potentials and force fields depend critically on accurate atomic structures, yet such data are scarce due to the limited availability of experimentally resolved crystals. Although atomic-resolution electron microscopy offers a potential source of structural data, converting these images into simulation-ready formats remains labor-intensive and error-prone, creating a bottleneck for model training and validation. We introduce AutoMat, an end-to-end, agent-assisted pipeline that automatically transforms scanning transmission electron microscopy (STEM) images into atomic crystal structures and predicts their physical properties. AutoMat combines pattern-adaptive denoising, physics-guided template retrieval, symmetry-aware atomic reconstruction, fast relaxation and property prediction via MatterSim, and coordinated orchestration across all stages. We propose the first dedicated STEM2Mat-Bench for this task and evaluate performance using lattice RMSD, formation energy MAE, and structure-matching success rate. By orchestrating external tool calls, AutoMat enables a text-only LLM to outperform vision-language models in this domain, achieving closed-loop reasoning throughout the pipeline. In large-scale experiments over 450 structure samples, AutoMat substantially outperforms existing multimodal large language models and tools. These results validate both AutoMat and STEM2Mat-Bench, marking a key step toward bridging microscopy and atomistic simulation in materials science. The code and dataset are publicly available at https://github.com/yyt-2378/AutoMat and https://huggingface.co/datasets/yaotianvector/STEM2Mat.
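To make the three evaluation metrics named in the abstract concrete, here is a minimal sketch of how such scores could be computed from per-sample results. The abstract does not define them precisely, so the function names, the choice of computing RMSD over paired lattice parameters, and the units are all assumptions made for illustration, not the benchmark's actual implementation.

```python
import math


def formation_energy_mae(predicted, reference):
    """Mean absolute error between predicted and reference formation energies.

    Units are assumed to match (for example eV/atom); this is an assumption.
    """
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - r) for p, r in zip(predicted, reference)) / len(predicted)


def lattice_rmsd(predicted_params, reference_params):
    """Root-mean-square deviation over paired lattice parameters.

    Assumed definition: the benchmark may compute RMSD differently.
    """
    if len(predicted_params) != len(reference_params) or not predicted_params:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(p - r) ** 2 for p, r in zip(predicted_params, reference_params)]
    return math.sqrt(sum(squared) / len(squared))


def success_rate(matched_flags):
    """Fraction of samples whose reconstructed structure matched the reference."""
    if not matched_flags:
        raise ValueError("need at least one sample")
    return sum(1 for flag in matched_flags if flag) / len(matched_flags)
```

For example, `success_rate([True, False, True, True])` gives 0.75, and `formation_energy_mae([1.0, 2.0], [1.5, 1.0])` gives 0.75.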
CL · Apr 22, 2024
Integrating Chemistry Knowledge in Large Language Models via Prompt Engineering
Hongxuan Liu, Haoyu Yin, Zhiyao Luo et al.
This paper presents a study on the integration of domain-specific knowledge in prompt engineering to enhance the performance of large language models (LLMs) in scientific domains. A benchmark dataset is curated to capture the intricate physicochemical properties of small molecules and their druggability for pharmacology, together with the functional attributes of enzymes and crystal materials, underscoring its relevance and applicability across biological and chemical domains. The proposed domain-knowledge embedded prompt engineering method outperforms traditional prompt engineering strategies on various metrics, including capability, accuracy, F1 score, and hallucination drop. The effectiveness of the method is demonstrated through case studies on complex materials including the MacMillan catalyst, paclitaxel, and lithium cobalt oxide. The results suggest that domain-knowledge prompts can guide LLMs to generate more accurate and relevant responses, highlighting the potential of LLMs as powerful tools for scientific discovery and innovation when equipped with domain-specific prompts. The study also discusses limitations and future directions for domain-specific prompt engineering development.
OC · Mar 17
Switched Linear Ensemble Systems and Structural Controllability
Haoyu Yin, Yi Li, Ouyang Du et al.
This paper introduces and solves a structural controllability problem for ensembles of switched linear systems. All individual systems in the ensemble are sparse and governed by the same sparsity pattern, and undergo switching among subsystems by following the same switching sequence. The controllability of an ensemble system describes the ability to use a common control input to simultaneously steer every individual system. A sparsity pattern is called structurally controllable for pair \((k,q)\) if it admits a controllable ensemble of \(q\) individual systems with at most \(k\) subsystems. We derive a necessary and sufficient condition for a sparsity pattern to be structurally controllable for a given \((k,q)\), and characterize when a sparsity pattern admits a finite \(k\) that guarantees structural controllability for \((k,q)\) for arbitrary \(q\). Compared with the linear time-invariant ensemble case, this second condition is strictly weaker. We further show that these conditions have natural connections with maximum flow, and hence can be checked by polynomial algorithms. Specifically, the time complexity of deciding structural controllability is \(O(n^3)\) and the complexity of computing the smallest number of subsystems needed is \(O(n^3 \log n)\), with \(n\) the dimension of each individual system.
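As background for the notion of a sparsity pattern admitting a controllable realization, the sketch below handles only the much simpler classical case of a single linear time-invariant system, not the switched ensemble setting or the max-flow conditions of the paper. It fills the nonzero positions of a pattern with random integers and applies the Kalman rank test in exact rational arithmetic. All function names are hypothetical. A `True` result exhibits a controllable realization; a `False` result from one random draw is only evidence, since an unlucky draw could in principle be degenerate.

```python
import random
from fractions import Fraction


def random_realization(pattern, rng):
    """Replace each nonzero pattern entry by a random nonzero integer."""
    return [[Fraction(rng.randint(1, 1000)) if entry else Fraction(0)
             for entry in row] for row in pattern]


def matmul(X, Y):
    """Plain matrix product for lists of lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def rank(M):
    """Exact rank via Gaussian elimination over the rationals."""
    M = [row[:] for row in M]
    rows, cols = len(M), len(M[0])
    r = 0
    for c in range(cols):
        pivot = next((i for i in range(r, rows) if M[i][c] != 0), None)
        if pivot is None:
            continue
        M[r], M[pivot] = M[pivot], M[r]
        for i in range(r + 1, rows):
            if M[i][c] != 0:
                factor = M[i][c] / M[r][c]
                M[i] = [a - factor * b for a, b in zip(M[i], M[r])]
        r += 1
        if r == rows:
            break
    return r


def has_controllable_realization(a_pattern, b_pattern, seed=0):
    """Kalman rank test on one random realization of (A, B) patterns."""
    rng = random.Random(seed)
    A = random_realization(a_pattern, rng)
    B = random_realization(b_pattern, rng)
    n = len(A)
    blocks = [B]
    for _ in range(n - 1):
        blocks.append(matmul(A, blocks[-1]))
    # Controllability matrix [B, AB, ..., A^(n-1) B], assembled row by row.
    ctrb = [sum((blk[i] for blk in blocks), []) for i in range(n)]
    return rank(ctrb) == n


# A three-state chain driven from its first state.
chain = [[0, 0, 0], [1, 0, 0], [0, 1, 0]]
print(has_controllable_realization(chain, [[1], [0], [0]]))
```

Driving the same chain from its last state, `has_controllable_realization(chain, [[0], [0], [1]])`, fails the rank test for every choice of values, because the input can never reach the first two states.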
DS · Apr 7
On Permanence of Conservative Replicator Dynamics with Four Strategies
Haoyu Yin, Xudong Chen, Bruno Sinopoli
In this paper, we study four-strategy conservative replicator dynamics induced by constant payoff matrices. We establish necessary and sufficient conditions for permanence to occur by associating the payoff matrix with its digraph, revealing exactly five distinct digraph classes governing the global behavior. We further show that, whenever the dynamics is permanent, every non-equilibrium trajectory in the relative interior of the simplex is a Lyapunov-stable periodic orbit. Together with the classification of the boundary phase portraits, these results provide a complete characterization of the global dynamics in the four-strategy case with permanence.
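For readers who want to experiment with four-strategy replicator dynamics, the following is a minimal numerical sketch of the standard replicator equation \(\dot{x}_i = x_i\big((Ax)_i - x^\top A x\big)\), integrated with a fixed-step fourth-order Runge-Kutta scheme. The abstract does not spell out its definition of "conservative", so the skew-symmetric cyclic payoff matrix `A_CYCLIC` is an assumed illustrative example, and the function names are hypothetical. For this particular matrix the column sums are zero, so the product \(x_1 x_2 x_3 x_4\) is constant along exact trajectories, which gives a convenient sanity check on the integrator. No claim is made here about whether this example is permanent.

```python
def replicator_rhs(x, A):
    """Right-hand side of the replicator equation for payoff matrix A."""
    n = len(x)
    payoff = [sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
    average = sum(x[i] * payoff[i] for i in range(n))
    return [x[i] * (payoff[i] - average) for i in range(n)]


def rk4_step(x, A, dt):
    """One classical fourth-order Runge-Kutta step."""
    k1 = replicator_rhs(x, A)
    k2 = replicator_rhs([xi + 0.5 * dt * k for xi, k in zip(x, k1)], A)
    k3 = replicator_rhs([xi + 0.5 * dt * k for xi, k in zip(x, k2)], A)
    k4 = replicator_rhs([xi + dt * k for xi, k in zip(x, k3)], A)
    return [xi + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
            for xi, a, b, c, d in zip(x, k1, k2, k3, k4)]


def simulate(x0, A, dt=0.01, steps=1000):
    """Integrate from x0 and return the final state."""
    x = list(x0)
    for _ in range(steps):
        x = rk4_step(x, A, dt)
    return x


# Assumed example: skew-symmetric payoff matrix of a four-strategy cycle.
A_CYCLIC = [
    [0, 1, 0, -1],
    [-1, 0, 1, 0],
    [0, -1, 0, 1],
    [1, 0, -1, 0],
]

x0 = [0.4, 0.3, 0.2, 0.1]
x_final = simulate(x0, A_CYCLIC)
print(x_final, sum(x_final))
```

The final state should remain on the simplex (components positive and summing to one up to rounding), and the product of its components should stay close to that of `x0`.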