Monte Carlo Thought Search: Large Language Model Querying for Complex Scientific Reasoning in Catalyst Design
This work is aimed at researchers in chemistry and materials science working on catalyst design. It is an incremental advance: it applies an existing search method, combined with LLMs, to a new domain.
The paper tackles the discovery of novel catalysts, where reasoning over multiple chemical properties and their trade-offs makes the search space grow combinatorially. It uses a Monte Carlo Tree Search-based approach with large language models to improve scientific reasoning and reports a 25.8% improvement over the best baseline.
Discovering novel catalysts requires complex reasoning involving multiple chemical properties and resultant trade-offs, leading to a combinatorial growth in the search space. While large language models (LLMs) have demonstrated novel capabilities for chemistry through complex instruction following and high-quality reasoning, goal-driven combinatorial search using LLMs has not been explored in detail. In this work, we present a Monte Carlo Tree Search-based approach that improves upon state-of-the-art chain-of-thought prompting variants to augment scientific reasoning. We introduce two new reasoning datasets: 1) a curation of computational chemistry simulations, and 2) diverse questions written by catalysis researchers for reasoning about novel chemical conversion processes. We improve over the best baseline by 25.8% and find that our approach can augment scientists' reasoning and discovery process with novel insights.
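
To make the idea of a goal-driven tree search over prompts more concrete, the following is a minimal, generic Monte Carlo Tree Search sketch in Python. It is not the method described in this work: the action set (`ACTIONS`), the depth limit, the UCT exploration constant, and the `stub_reward` function are all hypothetical placeholders chosen for illustration. In a real system the reward would presumably come from scoring an LLM's answer to the prompt built from the chosen actions; here a fixed toy function stands in for that call.

```python
import math
import random

# Hypothetical action set: each action adds one property to the prompt.
ACTIONS = ["low cost", "high activity", "high selectivity", "high stability"]
MAX_DEPTH = 3  # assumed limit on how many properties a prompt may combine


class Node:
    def __init__(self, state, parent=None):
        self.state = state        # tuple of actions chosen so far
        self.parent = parent
        self.children = {}        # action -> Node
        self.visits = 0
        self.total_reward = 0.0

    def is_terminal(self):
        return len(self.state) >= MAX_DEPTH

    def untried_actions(self):
        return [a for a in ACTIONS
                if a not in self.state and a not in self.children]


def stub_reward(state):
    """Toy stand-in for an LLM-derived score; purely illustrative."""
    weights = {"high activity": 0.5, "high selectivity": 0.3, "low cost": 0.2}
    return sum(weights.get(a, 0.0) for a in state)


def uct_child(node, c=1.4):
    """Pick the child maximizing mean reward plus an exploration bonus."""
    return max(
        node.children.values(),
        key=lambda ch: ch.total_reward / ch.visits
        + c * math.sqrt(math.log(node.visits) / ch.visits),
    )


def mcts(iterations=200, seed=0, reward_fn=stub_reward):
    rng = random.Random(seed)
    root = Node(state=())
    for _ in range(iterations):
        node = root
        # Selection: descend while the node is fully expanded.
        while not node.is_terminal() and not node.untried_actions():
            node = uct_child(node)
        # Expansion: add one unexplored child, if any.
        if not node.is_terminal():
            action = rng.choice(node.untried_actions())
            child = Node(node.state + (action,), parent=node)
            node.children[action] = child
            node = child
        # Simulation: complete the state with random actions.
        state = node.state
        while len(state) < MAX_DEPTH:
            remaining = [a for a in ACTIONS if a not in state]
            state = state + (rng.choice(remaining),)
        reward = reward_fn(state)
        # Backpropagation: update statistics up to the root.
        while node is not None:
            node.visits += 1
            node.total_reward += reward
            node = node.parent
    # Read out the most-visited path from the root.
    node = root
    while node.children:
        node = max(node.children.values(), key=lambda ch: ch.visits)
    return node.state


if __name__ == "__main__":
    print(mcts())
```

Calling `mcts()` returns a tuple of actions along the most-visited path. Swapping `reward_fn` for a function that queries and scores a language model is the natural extension point, though how that scoring is done is left open here.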