AI MTRL-SCI CL Aug 31, 2025

Aligning Reasoning LLMs for Materials Discovery with Physics-aware Rejection Sampling

Lee Hyun, Sohee Yoon, Jinwoo Park, Sue In Chae, Seongeon Park, Jooyeon Ahn, Yebin Jung, Youjung Chung, Hogeun Chang, Sujin Park, Myeonginn Kang, Jina Kim

arXiv:2509.00768v23.31 citations h-index: 9

Originality: Incremental advance

AI Analysis

This work addresses the need for reliable, efficient AI-driven materials discovery. The advance is incremental: it builds on existing reasoning-model frameworks and adds domain-specific enhancements.

The paper aims to improve reasoning large language models for materials discovery. It introduces Physics-aware Rejection Sampling (PaRS), a training-time trace selection method that favors physically admissible and accurate reasoning traces. Compared with baselines, PaRS improves accuracy and calibration, reduces physics-violation rates, and lowers sampling cost.

AI-driven materials discovery that couples automated experimentation with algorithmic decision-making requires process-aware recipe-to-property predictors that are accurate, calibrated, and physically admissible. We approach this as a reasoning problem with large reasoning models (LRMs). To instill reasoning capability into language models, we curate reasoning traces from a teacher model to train a student model. However, most training pipelines select reasoning traces using binary correctness or learned preference signals that poorly reflect physical admissibility. We introduce Physics-aware Rejection Sampling (PaRS), a training-time trace selection scheme that favors traces consistent with fundamental physics and numerically close to targets, with lightweight halting to control compute. We instantiate our framework with a large student model fine-tuned on traces synthesized by a larger teacher model, and evaluate under matched token budgets against various rejection sampling baselines. Our method improves accuracy and calibration, reduces physics-violation rates, and lowers sampling cost relative to baselines. These results indicate that modest, domain-aware constraints combined with trace-level selection provide a practical path toward reliable, efficient LRMs for process-aware property prediction and closed-loop materials design.
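The trace selection idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the physics checks, the closeness criterion, or the halting rule, so everything below is an assumption made for illustration. The names `Trace`, `select_traces`, `is_admissible`, `rel_tol`, `max_accept`, and `max_draws` are hypothetical. The sketch keeps a teacher trace only if it passes a caller-supplied physics check and its numeric prediction is within a relative tolerance of the target, and it stops drawing once enough traces are accepted.

```python
from dataclasses import dataclass
from itertools import islice
from typing import Callable, Iterable, List, Tuple


@dataclass
class Trace:
    """Hypothetical container for one teacher-generated reasoning trace."""
    text: str          # the reasoning text
    prediction: float  # the numeric property value the trace concludes with


def select_traces(
    candidates: Iterable[Trace],
    target: float,
    is_admissible: Callable[[Trace], bool],
    rel_tol: float = 0.1,   # assumed closeness threshold (relative error)
    max_accept: int = 2,    # assumed halting rule: stop after this many accepts
    max_draws: int = 16,    # assumed hard cap on traces examined
) -> Tuple[List[Trace], int]:
    """Return (accepted traces, number of traces examined)."""
    accepted: List[Trace] = []
    draws_used = 0
    scale = max(abs(target), 1e-12)  # avoid division by zero for target == 0
    for trace in islice(candidates, max_draws):
        draws_used += 1
        # Reject traces that violate the supplied physics check.
        if not is_admissible(trace):
            continue
        # Reject traces whose prediction is numerically far from the target.
        if abs(trace.prediction - target) / scale > rel_tol:
            continue
        accepted.append(trace)
        # Lightweight halting: stop sampling once enough traces are kept.
        if len(accepted) >= max_accept:
            break
    return accepted, draws_used


# Illustrative use with a made-up constraint: the property must lie in [0, 3].
pool = [Trace("a", -1.0), Trace("b", 5.0), Trace("c", 1.05),
        Trace("d", 0.98), Trace("e", 1.0)]
kept, used = select_traces(pool, target=1.0,
                           is_admissible=lambda t: 0.0 <= t.prediction <= 3.0)
print([t.text for t in kept], used)
```

In this toy run the early stop leaves the last candidate unexamined, which is the sense in which halting can lower sampling cost. A real pipeline would draw candidates lazily from the teacher model rather than from a fixed list.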
