DARWIN 1.5: Large Language Models as Materials Science Adapted Learners
This work addresses the challenge of generalizability and practical applicability of predictive models in materials science, for researchers and engineers, though it appears incremental because it adapts existing LLM technology to a specific domain.
The paper tackles the problem of materials discovery by developing DARWIN 1.5, a large language model tailored for materials science that takes natural language input instead of task-specific descriptors. It reports up to 59.1% improvement in prediction accuracy over the base LLaMA-7B architecture and outperforms state-of-the-art machine learning approaches across 8 materials design tasks.
Materials discovery and design aim to find compositions and structures with desirable properties over highly complex and diverse physical spaces. Traditional solutions, such as high-throughput simulations or machine learning, often rely on complex descriptors, which hinder generalizability and transferability across different material systems. Moreover, these descriptors may inadequately represent macro-scale material properties, which are influenced by structural imperfections and compositional variations in real-world samples, thus limiting their practical applicability. To address these challenges, we propose DARWIN 1.5, the largest open-source large language model tailored for materials science. By leveraging natural language as input, DARWIN eliminates the need for task-specific descriptors and enables a flexible, unified approach to material property prediction and discovery. Our approach integrates 6M material domain papers and 21 experimental datasets from 49,256 materials across modalities while enabling cross-task knowledge transfer. The enhanced model achieves up to 59.1% improvement in prediction accuracy over the base LLaMA-7B architecture and outperforms state-of-the-art (SOTA) machine learning approaches across 8 materials design tasks. These results establish LLMs as a promising foundation for developing versatile and scalable models in materials science.
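To make the idea of natural language input in place of task-specific descriptors concrete, the following is a minimal sketch of how records from different prediction tasks might be cast into one shared text format. It is not DARWIN's actual prompt template: the `MaterialRecord` class, the `to_instruction_example` helper, and the `instruction`/`input`/`output` field names (borrowed from a common instruction-tuning convention) are all assumptions made for illustration, and the two records are illustrative examples rather than entries from the paper's datasets.

```python
from dataclasses import dataclass


@dataclass
class MaterialRecord:
    """Hypothetical container for one labeled sample (not from the paper)."""
    composition: str    # material given as plain text, e.g. a chemical formula
    property_name: str  # property to predict, described in words
    value: str          # target kept as text, so regression and classification share a format


def to_instruction_example(record: MaterialRecord) -> dict:
    """Cast a record into an instruction-style example with no numeric descriptors."""
    return {
        "instruction": f"Predict the {record.property_name} of the given material.",
        "input": record.composition,
        "output": record.value,
    }


# Illustrative records only: one regression-style task and one classification-style task.
records = [
    MaterialRecord("GaAs", "band gap in eV", "1.42"),
    MaterialRecord("Cu", "metallic character (yes or no)", "yes"),
]

# The same function serves both tasks, which is the sense in which the format is unified.
examples = [to_instruction_example(r) for r in records]

for example in examples:
    print(example)
```

Because every task reduces to the same text-in, text-out shape, examples from many datasets could in principle be mixed in a single fine-tuning run, which is one plausible reading of how a unified format supports cross-task knowledge transfer.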