BML · GML · Feb 22, 2022

Structured Multi-task Learning for Molecular Property Prediction

arXiv:2203.04695v2 · 33 citations · Has Code
Originality: Incremental advance
AI Analysis

This work addresses data scarcity in drug discovery by leveraging structured task relations, offering a domain-specific incremental improvement for molecular property prediction.

The paper tackles limited labeled data in multi-task learning for molecular property prediction by introducing a setting in which a relation graph between tasks is available. It proposes SGNN-EBM, which models task relations in both the latent space and the output space, and reports empirical gains on the constructed ChEMBL-STRING dataset of around 400 tasks.

Multi-task learning for molecular property prediction is becoming increasingly important in drug discovery. However, in contrast to other domains, the performance of multi-task learning in drug discovery is still unsatisfactory because the amount of labeled data for each task is too limited, which calls for additional information to compensate for the data scarcity. In this paper, we study multi-task learning for molecular property prediction in a novel setting, where a relation graph between tasks is available. We first construct a dataset (ChEMBL-STRING) that includes around 400 tasks as well as a task relation graph. Then, to better utilize this relation graph, we propose a method called SGNN-EBM to systematically investigate structured task modeling from two perspectives. (1) In the latent space, we model the task representations by applying a state graph neural network (SGNN) on the relation graph. (2) In the output space, we employ structured prediction with an energy-based model (EBM), which can be efficiently trained through a noise-contrastive estimation (NCE) approach. Empirical results justify the effectiveness of SGNN-EBM. Code is available at https://github.com/chao1224/SGNN-EBM.
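
To make the output-space idea in the abstract more concrete, here is a minimal sketch of noise-contrastive estimation for an energy-based model over joint task labels. It is not the authors' implementation: the energy function (a unary score per task plus a reward when two related tasks agree), the independent Bernoulli noise distribution, and all names (`energy`, `nce_loss`, `pair_weight`, and so on) are assumptions made for illustration. The sketch also treats exp(-energy) as already normalized, a common NCE simplification, rather than learning a normalizer.

```python
import math
import random


def energy(y, unary, edges, pair_weight):
    """Energy of a joint binary label vector y; lower means more compatible.

    Hypothetical form: a per-task unary score plus a reward whenever two
    tasks linked in the relation graph take the same label.
    """
    node_term = sum(u * yi for u, yi in zip(unary, y))
    edge_term = sum(pair_weight for i, j in edges if y[i] == y[j])
    return -(node_term + edge_term)


def noise_log_prob(y, probs):
    """Log-probability of y under independent Bernoulli noise."""
    return sum(math.log(p if yi == 1 else 1.0 - p) for yi, p in zip(y, probs))


def sample_noise(probs, rng):
    """Draw one label vector from the Bernoulli noise distribution."""
    return [1 if rng.random() < p else 0 for p in probs]


def log_sigmoid(x):
    """Numerically stable log(sigmoid(x))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def nce_loss(y_data, noise_samples, unary, edges, pair_weight, probs):
    """Binary NCE loss: tell one observed label vector apart from k noise vectors."""
    k = len(noise_samples)

    def logit(y):
        # log p_model(y) - log(k * q(y)), assuming exp(-energy) is normalized.
        return -energy(y, unary, edges, pair_weight) - (math.log(k) + noise_log_prob(y, probs))

    loss = -log_sigmoid(logit(y_data))
    for y in noise_samples:
        loss -= log_sigmoid(-logit(y))  # log(1 - sigmoid(x)) == log_sigmoid(-x)
    return loss


# Toy example: three tasks connected in a chain 0 - 1 - 2.
edges = [(0, 1), (1, 2)]
unary = [0.0, 0.0, 0.0]
probs = [0.5, 0.5, 0.5]
rng = random.Random(0)
noise = [sample_noise(probs, rng) for _ in range(4)]
loss = nce_loss([1, 1, 1], noise, unary, edges, pair_weight=1.0, probs=probs)
```

In a trainable version, `unary` and `pair_weight` would be produced by a network conditioned on the molecule and on the task representations, and the loss would be minimized by gradient descent. The point of the sketch is only that NCE turns structured prediction into a data-versus-noise classification problem, avoiding an explicit sum over all joint label configurations.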

Code Implementations: 1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
