MolLIBRA: Genetic Molecular Optimization with Multi-Fingerprint Surrogates and Text-Molecule Aligned Critic

arXiv:2602.07002v1
h-index: 8
Originality: Incremental advance
AI Analysis

This work targets efficient optimization of molecular properties, a problem relevant to researchers in drug discovery. It is an incremental advance whose novelty lies in how it integrates existing methods.

The paper tackles sample-efficient molecular optimization under a limited budget of oracle evaluations. It proposes MolLIBRA, a genetic algorithm framework that pre-ranks candidates with multi-fingerprint surrogates and a text-molecule aligned critic before any oracle call is spent on them. On the PMO-1K benchmark, the MolLIBRA-L variant achieves the best Top-10 AUC on 14 of 22 tasks and the highest sum of Top-10 AUC across tasks.
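To show why Top-10 AUC rewards sample efficiency, the metric can be sketched as the budget-normalized area under the curve of the running mean of the ten best scores found so far. This is a simplified illustration under stated assumptions, not the benchmark's own implementation: it assumes oracle scores in [0, 1], updates the curve after every oracle call with a simple rectangle rule, and the function name `top_k_auc` is hypothetical.

```python
import heapq


def top_k_auc(scores, k=10, budget=None):
    """Budget-normalized area under the running top-k mean curve.

    scores: oracle scores in the order they were queried (assumed in [0, 1]).
    budget: total number of oracle calls allowed; if the run stops early,
            the last top-k mean is carried forward to the end of the budget.
    """
    budget = len(scores) if budget is None else budget
    best = []        # min-heap holding the k best scores seen so far
    current = 0.0    # running mean of the top-k scores
    area = 0.0
    for i in range(budget):
        if i < len(scores):
            s = scores[i]
            if len(best) < k:
                heapq.heappush(best, s)
            elif s > best[0]:
                heapq.heapreplace(best, s)
            current = sum(best) / len(best)
        area += current
    return area / budget


# Two toy runs that find the same best molecule, one early and one late.
early = top_k_auc([1.0] * 5 + [0.0] * 5, k=1)
late = top_k_auc([0.0] * 5 + [1.0] * 5, k=1)
```

Both toy runs end with the same best score, but the run that finds it early gets the larger area, which is the behaviour a limited oracle budget is meant to reward.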

We study sample-efficient molecular optimization under a limited budget of oracle evaluations. We propose MolLIBRA (MultimOdaLity and Language Integrated Bayesian and evolutionaRy optimizAtion), a genetic-algorithm-based framework that pre-ranks candidate molecules using multiple critics before oracle calls: (i) an ensemble of Gaussian process (GP) surrogates defined over multiple molecular fingerprints and (ii) a pretrained text-molecule aligned encoder, CLAMP. The GP ensemble enables adaptive selection of task-appropriate fingerprints, while CLAMP provides a zero-shot scoring signal from task descriptions by measuring the similarity between molecular and text embeddings. On the Practical Molecular Optimization (PMO) benchmark with a budget of 1,000 evaluations (PMO-1K), MolLIBRA-L, our variant with a language-model-based candidate generator, attains the best Top-10 AUC on 14/22 tasks and the highest overall sum of Top-10 AUC across tasks compared with prior methods.
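The pre-ranking idea in the abstract can be illustrated with a minimal sketch. The abstract does not say how the critics' scores are combined, so rank averaging is used here purely as an assumption. The surrogate critic is a stand-in lookup table rather than a real GP ensemble, the embeddings are toy vectors rather than CLAMP outputs, and all names (`cosine`, `to_ranks`, `pre_rank`) are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def to_ranks(scores):
    """Map scores to ranks, where rank 0 is the highest score."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    ranks = [0] * len(scores)
    for rank, idx in enumerate(order):
        ranks[idx] = rank
    return ranks


def pre_rank(candidates, critics, n_keep):
    """Keep the n_keep candidates with the best mean rank across critics.

    Rank averaging is an assumption of this sketch, not the paper's rule.
    """
    per_critic = [to_ranks([critic(c) for c in candidates]) for critic in critics]
    mean_rank = [
        sum(ranks[i] for ranks in per_critic) / len(critics)
        for i in range(len(candidates))
    ]
    order = sorted(range(len(candidates)), key=lambda i: mean_rank[i])
    return [candidates[i] for i in order[:n_keep]]


# Toy data: stand-in surrogate predictions and made-up 2-D embeddings.
surrogate_mean = {"A": 0.7, "B": 0.9, "C": 0.8, "D": 0.1}
mol_emb = {"A": [0.8, 0.6], "B": [0.0, 1.0], "C": [1.0, 0.0], "D": [-1.0, 0.0]}
text_emb = [1.0, 0.0]  # embedding of the task description

critics = [
    lambda m: surrogate_mean[m],              # surrogate critic (stand-in)
    lambda m: cosine(mol_emb[m], text_emb),   # text-molecule similarity critic
]

# Only the selected candidates would be sent to the expensive oracle.
selected = pre_rank(["A", "B", "C", "D"], critics, n_keep=2)
```

In this toy case the surrogate prefers B while the text critic prefers C; averaging ranks keeps both and drops A and D, so only two oracle calls would be spent instead of four.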

