Nakul Rampal

AI
h-index59
4papers
144citations
Novelty56%
AI Score42

4 Papers

AIJun 20, 2023
A GPT-4 Reticular Chemist for Guiding MOF Discovery

Zhiling Zheng, Zichao Rong, Nakul Rampal et al.

We present a new framework integrating the AI model GPT-4 into the iterative process of reticular chemistry experimentation, leveraging a cooperative workflow of interaction between AI and a human researcher. This GPT-4 Reticular Chemist is an integrated system composed of three phases. Each of these utilizes GPT-4 in various capacities, wherein GPT-4 provides detailed instructions for chemical experimentation and the human provides feedback on the experimental outcomes, including both success and failures, for the in-context learning of AI in the next iteration. This iterative human-AI interaction enabled GPT-4 to learn from the outcomes, much like an experienced chemist, by a prompt-learning strategy. Importantly, the system is based on natural language for both development and operation, eliminating the need for coding skills, and thus, make it accessible to all chemists. Our collaboration with GPT-4 Reticular Chemist guided the discovery of an isoreticular series of MOFs, with each synthesis fine-tuned through iterative feedback and expert suggestions. This workflow presents a potential for broader applications in scientific research by harnessing the capability of large language models like GPT-4 to enhance the feasibility and efficiency of research activities.

75.7MTRL-SCIMar 20
A chemical language model for reticular materials design

Dhruv Menon, Vivek Singh, Xu Chen et al.

Reticular chemistry has enabled the synthesis of tens of thousands of metal-organic frameworks (MOFs), yet the discovery of new materials still relies largely on intuition-driven linker design and iterative experimentation. As a result, researchers explore only a small fraction of the vast chemical space accessible to reticular materials, limiting the systematic discovery of frameworks with targeted properties. Here, we introduce Nexerra-R1, a building-block chemical language model that enables inverse design in reticular chemistry through the targeted generation of organic linkers. Rather than generating complete frameworks directly, Nexerra-R1 operates at the level of molecular building blocks, preserving the modular logic that underpins reticular synthesis. The model supports both unconstrained generation of low-connectivity linkers and scaffold-constrained design of symmetric multidentate motifs compatible with predefined nodes and topologies. We further combine linker generation with flow-guided distributional targeting to steer the generative process toward application-relevant objectives while maintaining chemical validity and assembly feasibility. The generated linkers are subsequently assembled into three-dimensional frameworks and are structurally optimized to produce candidate materials compatible with experimental synthesis. Using Nexerra-R1, we validate this strategy by rediscovering known MOFs and by proposing the experimental synthesis of a previously unreported framework, CU-525, generated entirely in silico. Together, these results establish a general inverse-design paradigm for reticular materials in which controllable chemical language modelling enables the direct translation from computational design to synthesizable frameworks.

CLMay 3, 2024Code
Single and Multi-Hop Question-Answering Datasets for Reticular Chemistry with GPT-4-Turbo

Nakul Rampal, Kaiyu Wang, Matthew Burigana et al.

The rapid advancement in artificial intelligence and natural language processing has led to the development of large-scale datasets aimed at benchmarking the performance of machine learning models. Herein, we introduce 'RetChemQA,' a comprehensive benchmark dataset designed to evaluate the capabilities of such models in the domain of reticular chemistry. This dataset includes both single-hop and multi-hop question-answer pairs, encompassing approximately 45,000 Q&As for each type. The questions have been extracted from an extensive corpus of literature containing about 2,530 research papers from publishers including NAS, ACS, RSC, Elsevier, and Nature Publishing Group, among others. The dataset has been generated using OpenAI's GPT-4 Turbo, a cutting-edge model known for its exceptional language understanding and generation capabilities. In addition to the Q&A dataset, we also release a dataset of synthesis conditions extracted from the corpus of literature used in this study. The aim of RetChemQA is to provide a robust platform for the development and evaluation of advanced machine learning algorithms, particularly for the reticular chemistry community. The dataset is structured to reflect the complexities and nuances of real-world scientific discourse, thereby enabling nuanced performance assessments across a variety of tasks. The dataset is available at the following link: https://github.com/nakulrampal/RetChemQA

AIDec 9, 2023
Image and Data Mining in Reticular Chemistry Using GPT-4V

Zhiling Zheng, Zhiguo He, Omar Khattab et al.

The integration of artificial intelligence into scientific research has reached a new pinnacle with GPT-4V, a large language model featuring enhanced vision capabilities, accessible through ChatGPT or an API. This study demonstrates the remarkable ability of GPT-4V to navigate and obtain complex data for metal-organic frameworks, especially from graphical sources. Our approach involved an automated process of converting 346 scholarly articles into 6240 images, which represents a benchmark dataset in this task, followed by deploying GPT-4V to categorize and analyze these images using natural language prompts. This methodology enabled GPT-4V to accurately identify and interpret key plots integral to MOF characterization, such as nitrogen isotherms, PXRD patterns, and TGA curves, among others, with accuracy and recall above 93%. The model's proficiency in extracting critical information from these plots not only underscores its capability in data mining but also highlights its potential in aiding the creation of comprehensive digital databases for reticular chemistry. In addition, the extracted nitrogen isotherm data from the selected literature allowed for a comparison between theoretical and experimental porosity values for over 200 compounds, highlighting certain discrepancies and underscoring the importance of integrating computational and experimental data. This work highlights the potential of AI in accelerating scientific discovery and innovation, bridging the gap between computational tools and experimental research, and paving the way for more efficient, inclusive, and comprehensive scientific inquiry.