AI | Jul 29, 2024

Hashing based Contrastive Learning for Virtual Screening

arXiv:2407.19790v1 | h-index: 1
Originality: Highly original
AI Analysis

This paper addresses efficient large-scale molecular screening, a known bottleneck for drug discovery researchers, and offers a novel method for it.

The paper tackles the high memory and time costs of virtual screening for drug discovery by proposing DrugHash, a hashing-based contrastive learning method that uses binary hash codes for retrieval, achieving state-of-the-art accuracy with 32× memory saving and 3.5× speed improvement.

Virtual screening (VS) is a critical step in computer-aided drug discovery, aiming to identify molecules that bind to a specific target receptor such as a protein. Traditional VS methods, such as docking, are often too time-consuming for screening large-scale molecular databases. Recent advances in deep learning have demonstrated that learning vector representations for both proteins and molecules using contrastive learning can outperform traditional docking methods. However, given that target databases often contain billions of molecules, the real-valued vector representations adopted by existing methods can still incur significant memory and time costs in VS. To address this problem, in this paper we propose a hashing-based contrastive learning method, called DrugHash, for VS. DrugHash treats VS as a retrieval task that uses efficient binary hash codes for retrieval. In particular, DrugHash designs a simple yet effective hashing strategy to enable end-to-end learning of binary hash codes for both protein and molecule modalities, which can dramatically reduce the memory and time costs while achieving higher accuracy than existing methods. Experimental results show that DrugHash can outperform existing methods to achieve state-of-the-art accuracy, with a memory saving of 32× and a speed improvement of 3.5×.
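
To make the retrieval idea concrete, here is a minimal sketch of ranking molecules by Hamming distance between binary codes. It is not the DrugHash implementation: the abstract does not describe the encoders or the hashing strategy, so the sketch assumes random vectors as stand-ins for learned embeddings and simple sign binarization as a stand-in for the learned hash function. The function names (`to_hash_codes`, `hamming_distances`, `top_k`) and the 128-dimensional code length are hypothetical. The memory ratio it computes matches the reported 32× only under the assumption that each 32-bit float is replaced by a single bit at the same dimensionality.

```python
import numpy as np

def to_hash_codes(embeddings: np.ndarray) -> np.ndarray:
    """Binarize real-valued vectors by sign and pack 8 bits into each byte.

    Assumption: sign binarization stands in for a learned hashing strategy.
    """
    bits = (embeddings > 0).astype(np.uint8)
    return np.packbits(bits, axis=1)

# Lookup table: number of set bits for every possible byte value.
POPCOUNT = np.array([bin(i).count("1") for i in range(256)], dtype=np.uint8)

def hamming_distances(query_code: np.ndarray, db_codes: np.ndarray) -> np.ndarray:
    """Hamming distance from one packed query code to every packed database code."""
    xor = np.bitwise_xor(db_codes, query_code)  # differing bits, byte by byte
    return POPCOUNT[xor].sum(axis=1, dtype=np.int64)

def top_k(query_code: np.ndarray, db_codes: np.ndarray, k: int) -> np.ndarray:
    """Indices of the k database entries closest to the query in Hamming distance."""
    distances = hamming_distances(query_code, db_codes)
    return np.argsort(distances, kind="stable")[:k]

rng = np.random.default_rng(0)
dim = 128  # hypothetical code length, in bits

# Stand-ins for learned molecule embeddings (random, for illustration only).
molecules = rng.standard_normal((1000, dim)).astype(np.float32)
# Stand-in for a protein embedding that lies close to molecule 42.
protein = molecules[42] + 0.1 * rng.standard_normal(dim).astype(np.float32)

mol_codes = to_hash_codes(molecules)               # shape (1000, 16), dtype uint8
protein_code = to_hash_codes(protein[None, :])[0]  # shape (16,)

ranked = top_k(protein_code, mol_codes, k=5)

# 128 float32 values take 512 bytes per molecule; 128 bits take 16 bytes.
memory_ratio = molecules.nbytes / mol_codes.nbytes
```

XOR followed by a bit count is the standard way to compare packed binary codes, and it is why binary retrieval can be faster than floating-point similarity search. The speedup seen in practice depends on hardware and implementation, so this sketch makes no claim about reproducing the 3.5× figure.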

