CL AI · Jul 26, 2024

Many-Shot In-Context Learning for Molecular Inverse Design

Saeed Moayedpour, Alejandro Corrochano-Navarro, Faryad Sahneh, Shahriar Noroozizadeh, Alexander Koetter, Jiri Vymetal, Lorenzo Kogler-Anele, Pablo Mas, Yasser Jangjou, Sizhen Li, Michael Bailey, Marc Bianciotto

arXiv:2407.19089v1 · 5.57 citations · h-index: 20

Originality: Incremental advance

AI Analysis

This work offers scientists an accessible, easy-to-use method for molecular inverse design that enhances in-context learning (ICL). The advance appears incremental because it builds on existing LLM and ICL frameworks.

The paper addresses the shortage of experimental data available for many-shot ICL in molecular inverse design. It develops a semi-supervised method that iteratively adds LLM-generated molecules with high predicted performance to the context, and the authors report significant improvements over existing ICL methods for molecular design.

Large Language Models (LLMs) have demonstrated strong performance in few-shot In-Context Learning (ICL) for a variety of generative and discriminative chemical design tasks. The newly expanded context windows of LLMs can further improve ICL capabilities for molecular inverse design and lead optimization. To take full advantage of these capabilities, we developed a new semi-supervised learning method that overcomes the lack of experimental data available for many-shot ICL. Our approach iteratively includes LLM-generated molecules with high predicted performance alongside the experimental data. We further integrated our method into a multi-modal LLM, which allows interactive modification of generated molecular structures using text instructions. As we show, the new method greatly improves upon existing ICL methods for molecular design while being accessible and easy to use for scientists.
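
The iterative inclusion step described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `build_prompt`, `augment_examples`, `toy_generate`, and `toy_predict` are hypothetical names, the generator stands in for an LLM call, the predictor stands in for a trained property model, and the prompt wording, number of rounds, and top-k cutoff are arbitrary choices.

```python
import random
from typing import Callable, List, Tuple

Example = Tuple[str, float]  # (molecule string, measured or predicted score)


def build_prompt(examples: List[Example]) -> str:
    """Format every labelled example as one shot of a many-shot prompt."""
    shots = "\n".join(f"molecule: {m} | score: {s:.2f}" for m, s in examples)
    return f"{shots}\nPropose a new molecule with a higher score:"


def augment_examples(
    experimental: List[Example],
    generate: Callable[[str, int], List[str]],  # hypothetical LLM wrapper
    predict: Callable[[str], float],            # hypothetical property predictor
    n_rounds: int = 3,
    n_candidates: int = 20,
    top_k: int = 5,
) -> List[Example]:
    """Grow the in-context example pool with high-scoring generated molecules."""
    pool = list(experimental)          # experimental data always stays in the pool
    seen = {m for m, _ in pool}
    for _ in range(n_rounds):
        candidates = generate(build_prompt(pool), n_candidates)
        # Drop duplicates (order-preserving) and molecules already in the pool.
        novel = [m for m in dict.fromkeys(candidates) if m not in seen]
        ranked = sorted(((m, predict(m)) for m in novel),
                        key=lambda pair: pair[1], reverse=True)
        # Keep only the candidates with the highest *predicted* score.
        for mol, score in ranked[:top_k]:
            pool.append((mol, score))
            seen.add(mol)
    return pool


# Toy stand-ins so the sketch runs without an LLM or a trained model.
_rng = random.Random(0)


def toy_generate(prompt: str, n: int) -> List[str]:
    """Assumption: ignores the prompt and emits random 6-character strings."""
    return ["".join(_rng.choice("CNO") for _ in range(6)) for _ in range(n)]


def toy_predict(molecule: str) -> float:
    """Assumption: the 'property' is simply the fraction of carbon atoms."""
    return molecule.count("C") / len(molecule)


seed_data = [("CCNOCC", 0.67), ("NOCNOC", 0.33)]
pool = augment_examples(seed_data, toy_generate, toy_predict,
                        n_rounds=2, n_candidates=10, top_k=3)
```

Note that the pool mixes measured and predicted scores in the same prompt; a real pipeline would probably want to track which is which, since errors in the predictor can compound across rounds.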
