Bangchen Yin

LG
h-index: 14
4 papers
10 citations
Novelty: 51%
AI Score: 38

4 Papers

CL Jun 21, 2023
Interactive Molecular Discovery with Natural Language

Zheni Zeng, Bangchen Yin, Shipeng Wang et al.

Natural language is expected to be a key medium for various human-machine interactions in the era of large language models. When it comes to the biochemistry field, a series of tasks around molecules (e.g., property prediction, molecule mining, etc.) are of great significance while having a high technical threshold. Bridging the molecule expressions in natural language and chemical language can not only hugely improve the interpretability and reduce the operation difficulty of these tasks, but also fuse the chemical knowledge scattered in complementary materials for a deeper comprehension of molecules. Based on these benefits, we propose the conversational molecular design, a novel task adopting natural language for describing and editing target molecules. To better accomplish this task, we design ChatMol, a knowledgeable and versatile generative pre-trained model, enhanced by injecting experimental property information, molecular spatial knowledge, and the associations between natural and chemical languages into it. Several typical solutions including large language models (e.g., ChatGPT) are evaluated, proving the challenge of conversational molecular design and the effectiveness of our knowledge enhancement method. Case observations and analysis are conducted to provide directions for further exploration of natural-language interaction in molecular discovery.

38.8 LG Mar 26
Hessian-informed machine learning interatomic potential towards bridging theory and experiments

Bangchen Yin, Jian Ouyang, Zhen Fan et al.

Local curvature of potential energy surfaces is critical for predicting certain experimental observables of molecules and materials from first principles, yet it remains far beyond reach for complex systems. In this work, we introduce a Hessian-informed Machine Learning Interatomic Potential (Hi-MLIP) that captures such curvature reliably, thereby enabling accurate analysis of associated thermodynamic and kinetic phenomena. To make Hessian supervision practically viable, we develop a highly efficient training protocol, termed Hessian INformed Training (HINT), achieving two to four orders of magnitude reduction for the requirement of expensive Hessian labels. HINT integrates critical techniques, including Hessian pre-training, configuration sampling, curriculum learning and stochastic projection Hessian loss. Enabled by HINT, Hi-MLIP significantly improves transition-state search and brings Gibbs free-energy predictions close to chemical accuracy especially in data-scarce regimes. Our framework also enables accurate treatment of strongly anharmonic hydrides, reproducing phonon renormalization and superconducting critical temperatures in close agreement with experiment while bypassing the computational bottleneck of anharmonic calculations. These results establish a practical route to enhancing curvature awareness of machine learning interatomic potentials, bridging simulation and experimental observables across a wide range of systems.
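The abstract names a "stochastic projection Hessian loss" but does not define it. One plausible reading, sketched below, compares Hessian-vector products along random probe directions instead of full Hessians, so no complete Hessian of the model is ever built. Everything in the sketch is an assumption made for illustration: the toy energy, the function names (`grad_energy`, `hessian_vector_product`, `projected_hessian_loss`), the Gaussian probes, and the finite-difference step. It is not the authors' HINT implementation.

```python
import random

def grad_energy(x, k):
    """Gradient of a toy energy E(x) = 0.5*k*sum(x_i^2) + 0.25*sum(x_i^4).
    Stands in for the gradient (negative forces) of a learned potential."""
    return [k * xi + xi ** 3 for xi in x]

def hessian_vector_product(grad_fn, x, v, eps=1e-4):
    """Central finite difference: H v ~= (g(x + eps*v) - g(x - eps*v)) / (2*eps)."""
    x_plus = [xi + eps * vi for xi, vi in zip(x, v)]
    x_minus = [xi - eps * vi for xi, vi in zip(x, v)]
    g_plus, g_minus = grad_fn(x_plus), grad_fn(x_minus)
    return [(a - b) / (2 * eps) for a, b in zip(g_plus, g_minus)]

def projected_hessian_loss(grad_fn, x, ref_hessian, n_probes=4, seed=0):
    """Mean squared mismatch between model and reference Hessians,
    measured only along n_probes random directions."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_probes):
        v = [rng.gauss(0.0, 1.0) for _ in x]  # random probe direction
        hv_model = hessian_vector_product(grad_fn, x, v)
        hv_ref = [sum(h * vj for h, vj in zip(row, v)) for row in ref_hessian]
        total += sum((a - b) ** 2 for a, b in zip(hv_model, hv_ref))
    return total / n_probes

# Usage: the reference Hessian of the toy energy with k = 2 at x is
# diagonal with entries k + 3*x_i^2.
x = [0.5, -1.0]
ref_hessian = [[2.75, 0.0], [0.0, 5.0]]
loss_matching = projected_hessian_loss(lambda p: grad_energy(p, 2.0), x, ref_hessian)
loss_mismatched = projected_hessian_loss(lambda p: grad_energy(p, 3.0), x, ref_hessian)
```

Here `loss_matching` is near zero because the model curvature equals the reference, while `loss_mismatched` is clearly positive. In a real training loop the Hessian-vector product would more likely come from automatic differentiation than from finite differences.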

LG Jan 13, 2025
AlphaNet: Scaling Up Local-frame-based Atomistic Interatomic Potential

Bangchen Yin, Jiaao Wang, Weitao Du et al.

Molecular dynamics simulations demand an unprecedented combination of accuracy and scalability to tackle grand challenges in catalysis and materials design. To bridge this gap, we present AlphaNet, a local-frame-based equivariant model that simultaneously improves computational efficiency and predictive precision for interatomic interactions. By constructing equivariant local frames with learnable geometric transitions, AlphaNet encodes atomic environments with enhanced representational capacity, achieving state-of-the-art accuracy in energy and force predictions. Extensive benchmarks on large-scale datasets spanning molecular reactions, crystal stability, and surface catalysis (Matbench Discovery and OC2M) demonstrate its superior performance over existing neural network interatomic potentials while ensuring scalability across diverse system sizes with varying types of interatomic interactions. The synergy of accuracy, efficiency, and transferability positions AlphaNet as a transformative tool for modeling multiscale phenomena, decoding dynamics in catalysis and functional interfaces, with direct implications for accelerating the discovery of complex molecular systems and functional materials.
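The idea of a local frame can be illustrated with a minimal sketch: build an orthonormal frame from two neighbor vectors, then express any other vector in that frame. The resulting coordinates do not change when the whole system is rotated. The Gram-Schmidt construction and all function names below are assumptions chosen for illustration. AlphaNet's frames involve learnable geometric transitions, which this fixed construction does not attempt to reproduce.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def normalize(a):
    n = math.sqrt(dot(a, a))
    return [x / n for x in a]

def local_frame(r1, r2):
    """Orthonormal frame from two non-collinear neighbor vectors (Gram-Schmidt)."""
    e1 = normalize(r1)
    proj = dot(r2, e1)
    e2 = normalize([r2[i] - proj * e1[i] for i in range(3)])
    e3 = cross(e1, e2)  # completes a right-handed frame
    return e1, e2, e3

def to_local(frame, v):
    """Coordinates of v in the local frame; unchanged by global rotations."""
    return [dot(e, v) for e in frame]

def rotate_z(v, angle):
    """Rotate v about the z axis, used here only to check invariance."""
    c, s = math.cos(angle), math.sin(angle)
    return [c * v[0] - s * v[1], s * v[0] + c * v[1], v[2]]

# Usage: the same vector described before and after a global rotation.
r1, r2, v = [1.0, 0.0, 0.0], [1.0, 1.0, 0.0], [0.2, 0.5, -0.3]
coords = to_local(local_frame(r1, r2), v)
coords_rotated = to_local(
    local_frame(rotate_z(r1, 0.7), rotate_z(r2, 0.7)), rotate_z(v, 0.7)
)
```

`coords` and `coords_rotated` agree up to floating-point error. Features built from such coordinates are rotation invariant, and mapping predicted vectors back through the frame gives rotation-equivariant outputs such as forces.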

LG Mar 26, 2024
EL-MLFFs: Ensemble Learning of Machine Learning Force Fields

Bangchen Yin, Yue Yin, Yuda W. Tang et al.

Machine learning force fields (MLFFs) have emerged as a promising approach to bridge the accuracy of quantum mechanical methods and the efficiency of classical force fields. However, the abundance of MLFF models and the challenge of accurately predicting atomic forces pose significant obstacles in their practical application. In this paper, we propose a novel ensemble learning framework, EL-MLFFs, which leverages the stacking method to integrate predictions from diverse MLFFs and enhance force prediction accuracy. By constructing a graph representation of molecular structures and employing a graph neural network (GNN) as the meta-model, EL-MLFFs effectively captures atomic interactions and refines force predictions. We evaluate our approach on two distinct datasets: methane molecules and methanol adsorbed on a Cu(100) surface. The results demonstrate that EL-MLFFs significantly improves force prediction accuracy compared to individual MLFFs, with the ensemble of all eight models yielding the best performance. Moreover, our ablation study highlights the crucial roles of the residual network and graph attention layers in the model's architecture. The EL-MLFFs framework offers a promising solution to the challenges of model selection and force prediction accuracy in MLFFs, paving the way for more reliable and efficient molecular simulations.
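Stacking, as described above, trains a meta-model on the predictions of several base models. The sketch below shows that idea in its simplest form, with a linear least-squares meta-model substituted for the GNN meta-model the abstract describes. The function names and the toy force values are assumptions for illustration only and do not come from the paper.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def fit_stacking_weights(base_preds, targets):
    """Least-squares weights via the normal equations (P^T P) w = P^T y.
    base_preds has one row per force component and one column per base model."""
    k = len(base_preds[0])
    gram = [[sum(row[i] * row[j] for row in base_preds) for j in range(k)]
            for i in range(k)]
    rhs = [sum(row[i] * y for row, y in zip(base_preds, targets))
           for i in range(k)]
    return solve_linear_system(gram, rhs)

def stacked_prediction(weights, preds):
    """Meta-model output: weighted sum of the base-model predictions."""
    return sum(w * p for w, p in zip(weights, preds))

# Usage: two hypothetical base models with opposite errors on toy forces.
targets = [1.0, -2.0, 0.5, 3.0]
base_preds = [[1.3, 0.7], [-2.1, -1.9], [0.7, 0.3], [2.6, 3.4]]
weights = fit_stacking_weights(base_preds, targets)
combined = [stacked_prediction(weights, row) for row in base_preds]
```

Because the two toy models err in opposite directions, the fitted weights come out close to 0.5 each and the combined prediction recovers the targets. In practice the meta-model should be fitted on predictions for data the base models did not see during training, otherwise it rewards overfitted base models.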