LLM-Fusion: A Novel Multimodal Fusion Model for Accelerated Material Discovery
This work aims to accelerate material discovery for materials scientists. Its contribution is an incremental improvement achieved through a novel fusion method.
The paper tackles inefficient material discovery by proposing LLM-Fusion, a multimodal fusion model that uses large language models to integrate diverse material representations. It reports higher property-prediction accuracy than unimodal and naive concatenation baselines on two datasets across five tasks.
Efficiently discovering materials with desirable properties remains a significant problem in materials science. Many studies have tackled this problem using different sets of available information about the materials. Among them, multimodal approaches are promising because they can combine different sources of information. However, fusion algorithms to date remain simple and lack a mechanism for producing a rich joint representation of multiple modalities. This paper presents LLM-Fusion, a novel multimodal fusion model that leverages large language models (LLMs) to integrate diverse representations, such as SMILES, SELFIES, text descriptions, and molecular fingerprints, for accurate property prediction. Our approach introduces a flexible LLM-based architecture that supports multimodal input processing and predicts material properties with higher accuracy than traditional methods. We validate our model on two datasets across five prediction tasks and demonstrate its effectiveness compared to unimodal and naive concatenation baselines.
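To make the contrast between naive concatenation and a learned fusion step concrete, the following is a minimal sketch and not the architecture of LLM-Fusion, which the abstract does not specify. It assumes each modality has already been encoded into a fixed-length vector (random placeholders here), and it substitutes a single untrained self-attention step for the LLM. All names (`concat_baseline`, `token_fusion`, `d_model`, the modality dimensions) are hypothetical and chosen only for illustration.

```python
import math
import random


def linear(x, w):
    """Multiply a vector x (length d_in) by a weight matrix w (d_in x d_out)."""
    return [sum(xi * w[i][j] for i, xi in enumerate(x)) for j in range(len(w[0]))]


def random_matrix(rng, rows, cols):
    """Small random weights; a trained model would learn these."""
    scale = 1.0 / math.sqrt(rows)
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def concat_baseline(embeddings):
    """Naive fusion: join all modality vectors end to end."""
    return [v for name in sorted(embeddings) for v in embeddings[name]]


def token_fusion(embeddings, projections, d_model):
    """Treat each modality as one token and mix the tokens with self-attention.

    Assumption: this single attention step is a stand-in for the LLM.
    """
    names = sorted(embeddings)
    # Project every modality into a shared d_model-dimensional space.
    tokens = [linear(embeddings[n], projections[n]) for n in names]
    fused = []
    for query in tokens:
        scores = [
            sum(a * b for a, b in zip(query, key)) / math.sqrt(d_model)
            for key in tokens
        ]
        weights = softmax(scores)
        # Each output token is a weighted mix of all modality tokens.
        fused.append(
            [sum(w * t[j] for w, t in zip(weights, tokens)) for j in range(d_model)]
        )
    # Mean-pool the mixed tokens into one vector for a downstream predictor.
    return [sum(tok[j] for tok in fused) / len(fused) for j in range(d_model)]


rng = random.Random(0)
# Hypothetical embedding sizes for the four modalities named in the abstract.
dims = {"smiles": 8, "selfies": 8, "text": 12, "fingerprint": 16}
embeddings = {n: [rng.gauss(0, 1) for _ in range(d)] for n, d in dims.items()}
d_model = 6
projections = {n: random_matrix(rng, d, d_model) for n, d in dims.items()}

concat_vec = concat_baseline(embeddings)
fused_vec = token_fusion(embeddings, projections, d_model)
print(len(concat_vec), len(fused_vec))  # prints "44 6"
```

In this sketch the concatenated vector grows with every added modality and never lets one modality reweight another, whereas the token-based variant keeps a fixed output size and lets each modality attend to the others. Whether this matches how LLM-Fusion combines its inputs is an assumption.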