49.8CLMay 28Code
From Blind Guess to Informed Judgment: Teaching LLMs to Evaluate Materials by Building Knowledge-Augmented Preference SignalsYeyong Yu, Wenya Hu, Xing Wu et al.
As candidate generation and high-throughput experimentation advance, the primary bottleneck in materials discovery is shifting from property prediction to making reliable evaluations among massive candidate sets. We propose a Knowledge-Augmented Preference Signals Framework, MaterEval, that automatically produces, for the same candidate, two evaluations: an informed judgment that follows expert rules and provides supporting evidence, and a rule-removed blind guess. By pairing the two evaluations as preference data, we guide general-purpose large language models (LLMs), originally lacking materials-specific criteria, from intuitive judgment toward reliable evaluation supported by explicit evidence. To balance throughput, cost, and reliability, we further introduce a fast-slow reasoning scheme that decouples large-scale rapid screening from in-depth review on a small subset. Using high-entropy alloy (HEA) assessment as a case study, we show that, without external retrieval and relying solely on internalized capabilities, small open-source LLMs achieve substantial gains in accuracy, conclusion consistency, and evidence discrimination, approaching the performance of rule-based closed-source LLMs. These results demonstrate that expert rules can be systematically transformed into learnable preference signals, enabling a low-cost and deployable evaluation module for autonomous materials discovery loops.
CLAug 20, 2024Code
BEYOND DIALOGUE: A Profile-Dialogue Alignment Framework Towards General Role-Playing Language ModelYeyong Yu, Runsheng Yu, Haojie Wei et al.
The rapid advancement of large language models (LLMs) has revolutionized role-playing, enabling the development of general role-playing models. However, current role-playing training has two significant issues: (I) Using a predefined role profile to prompt dialogue training for specific scenarios usually leads to inconsistencies and even conflicts between the dialogue and the profile, resulting in training biases. (II) The model learns to imitate the role based solely on the profile, neglecting profile-dialogue alignment at the sentence level. In this work, we propose a simple yet effective framework called BEYOND DIALOGUE, designed to overcome these hurdles. This framework innovatively introduces "beyond dialogue" tasks to align dialogue with profile traits based on each specific scenario, thereby eliminating biases during training. Furthermore, by adopting an innovative prompting mechanism that generates reasoning outcomes for training, the framework allows the model to achieve fine-grained alignment between profile and dialogue at the sentence level. The aforementioned methods are fully automated and low-cost. Additionally, the integration of automated dialogue and objective evaluation methods forms a comprehensive framework, paving the way for general role-playing. Experimental results demonstrate that our model excels in adhering to and reflecting various dimensions of role profiles, outperforming most proprietary general and specialized role-playing baselines. All code and datasets are available at https://github.com/yuyouyu32/BeyondDialogue.
LGJun 17, 2025
AIMatDesign: Knowledge-Augmented Reinforcement Learning for Inverse Materials Design under Data ScarcityYeyong Yu, Xilei Bian, Jie Xiong et al.
With the growing demand for novel materials, machine learning-driven inverse design methods face significant challenges in reconciling the high-dimensional materials composition space with limited experimental data. Existing approaches suffer from two major limitations: (I) machine learning models often lack reliability in high-dimensional spaces, leading to prediction biases during the design process; (II) these models fail to effectively incorporate domain expert knowledge, limiting their capacity to support knowledge-guided inverse design. To address these challenges, we introduce AIMatDesign, a reinforcement learning framework that addresses these limitations by augmenting experimental data using difference-based algorithms to build a trusted experience pool, accelerating model convergence. To enhance model reliability, an automated refinement strategy guided by large language models (LLMs) dynamically corrects prediction inconsistencies, reinforcing alignment between reward signals and state value functions. Additionally, a knowledge-based reward function leverages expert domain rules to improve stability and efficiency during training. Our experiments demonstrate that AIMatDesign significantly surpasses traditional machine learning and reinforcement learning methods in discovery efficiency, convergence speed, and success rates. Among the numerous candidates proposed by AIMatDesign, experimental synthesis of representative Zr-based alloys yielded a top-performing BMG with 1.7GPa yield strength and 10.2\% elongation, closely matching predictions. Moreover, the framework accurately captured the trend of yield strength variation with composition, demonstrating its reliability and potential for closed-loop materials discovery.