Accurate Prediction of Ligand-Protein Interaction Affinities with Fine-Tuned Small Language Models
This work aims to accelerate drug discovery by predicting drug-target interactions more accurately, though the contribution appears incremental because it builds on existing language model techniques.
The paper tackles the prediction of ligand-protein interaction affinities with fine-tuned small language models and reports improved accuracy over existing machine learning and free-energy perturbation methods in a zero-shot setting.
We describe the accurate prediction of ligand-protein interaction (LPI) affinities, also known as drug-target interactions (DTI), with instruction fine-tuned, pretrained generative small language models (SLMs). We achieved accurate predictions for a range of affinity values associated with ligand-protein interactions on out-of-sample data in a zero-shot setting. Only the SMILES string of the ligand and the amino acid sequence of the protein were used as model inputs. Our results demonstrate a clear improvement over machine learning (ML) and free-energy perturbation (FEP+) based methods in predicting a range of ligand-protein interaction affinities, which can be leveraged to further accelerate drug discovery campaigns against challenging therapeutic targets.
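To make the input format concrete, the following is a minimal sketch of how a SMILES string and an amino acid sequence might be combined into a single instruction prompt, and how a numeric affinity might be read back from the generated text. The prompt template, the function names `build_prompt` and `parse_affinity`, and the assumption that the model responds with a bare number are all hypothetical illustrations; the exact prompt wording and output format used in this work are not specified in the abstract.

```python
import re
from typing import Optional

# Hypothetical instruction template (an assumption for illustration only;
# the actual prompt wording used for fine-tuning is not given in the text).
PROMPT_TEMPLATE = (
    "### Instruction:\n"
    "Predict the binding affinity of the ligand to the protein.\n"
    "### Input:\n"
    "Ligand SMILES: {smiles}\n"
    "Protein sequence: {sequence}\n"
    "### Response:\n"
)

# Matches the first integer, decimal, or scientific-notation number.
_NUMBER = re.compile(r"[-+]?\d+(?:\.\d+)?(?:[eE][-+]?\d+)?")


def build_prompt(smiles: str, sequence: str) -> str:
    """Format the two model inputs (ligand SMILES, protein sequence) as one prompt."""
    return PROMPT_TEMPLATE.format(
        smiles=smiles.strip(),
        sequence=sequence.strip().upper(),  # one-letter amino acid codes, upper case
    )


def parse_affinity(generated: str) -> Optional[float]:
    """Return the first number in the generated text, or None if there is none.

    Assumes the model answers with a bare numeric value; a response such as
    "IC50: 12 nM" would be misread, so a stricter parser is needed in practice.
    """
    match = _NUMBER.search(generated)
    return float(match.group()) if match else None


if __name__ == "__main__":
    # Aspirin SMILES with a short, made-up peptide fragment as the "protein".
    print(build_prompt("CC(=O)Oc1ccccc1C(=O)O", "mktayiakqr"))
    print(parse_affinity("7.3"))
```

In a zero-shot evaluation of this kind, the prompt would be passed to the fine-tuned model for ligand-protein pairs not seen during training, and the parsed value compared against the measured affinity.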