Learning Flexible Forward Trajectories for Masked Molecular Diffusion
This work addresses a critical bottleneck in generating chemically valid molecules for drug discovery and materials science, and markedly improves on masked diffusion models that use element-agnostic noise scheduling.
The paper tackles the poor performance of masked diffusion models in molecular generation. It identifies a state-clashing problem, in which distinct molecules collapse into a common state during forward diffusion, and proposes MELD, which learns per-element corruption trajectories to avoid such collisions. MELD raises the chemical validity of vanilla MDMs on ZINC250K from 15% to 93% and achieves state-of-the-art property alignment in conditional generation tasks.
Masked diffusion models (MDMs) have achieved notable progress in modeling discrete data, while their potential in molecular generation remains underexplored. In this work, we explore their potential and introduce the surprising result that naively applying standard MDMs severely degrades performance. We identify the critical cause of this issue as a state-clashing problem, where the forward diffusion of distinct molecules collapses into a common state, resulting in a mixture of reconstruction targets that cannot be learned using the typical reverse diffusion process with unimodal predictions. To mitigate this, we propose Masked Element-wise Learnable Diffusion (MELD), which orchestrates per-element corruption trajectories to avoid collisions between distinct molecular graphs. This is achieved through a parameterized noise scheduling network that assigns distinct corruption rates to individual graph elements, i.e., atoms and bonds. Extensive experiments on diverse molecular benchmarks reveal that MELD markedly enhances overall generation quality compared to element-agnostic noise scheduling, increasing the chemical validity of vanilla MDMs on ZINC250K from 15% to 93%. Furthermore, it achieves state-of-the-art property alignment in conditional generation tasks.
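The state-clashing problem and the idea of per-element corruption rates can be illustrated with a toy sketch. This is not the MELD implementation: the abstract does not specify the noise scheduling network, so the schedule form `1 - (1 - t) ** rate`, the hand-set `rates`, and all function names below are hypothetical, and molecules are reduced to flat lists of atom symbols rather than graphs with bonds.

```python
import random

MASK = "[M]"  # placeholder token for a corrupted element


def mask_prob(t, rate):
    """Probability that an element is masked by time t in [0, 1].

    Assumed schedule (not from the paper): 1 - (1 - t) ** rate.
    rate = 1 recovers a linear, element-agnostic schedule.
    """
    return 1.0 - (1.0 - t) ** rate


def forward_mask(elements, t, rates, rng):
    """Mask each element independently using its own corruption rate."""
    return [
        MASK if rng.random() < mask_prob(t, rate) else element
        for element, rate in zip(elements, rates)
    ]


def clash(x, y, masked_positions):
    """True if masking the given positions maps x and y to the same state."""
    def hide(seq):
        return [MASK if i in masked_positions else e for i, e in enumerate(seq)]
    return hide(x) == hide(y)


# Two toy "molecules" that differ only in their last atom.
mol_a = ["C", "C", "O"]
mol_b = ["C", "C", "N"]

# Masking the distinguishing atom collapses both into ["C", "C", "[M]"],
# so the reverse process faces a mixture of targets (O or N).
print(clash(mol_a, mol_b, {2}))  # True
print(clash(mol_a, mol_b, {0}))  # False

# Element-agnostic schedule: every element shares one rate.
agnostic_rates = [1.0, 1.0, 1.0]
# Per-element schedule (hand-set here; MELD learns these with a network).
# A rate below 1 delays corruption of the distinguishing atom.
per_element_rates = [2.0, 2.0, 0.5]

rng = random.Random(0)
print(forward_mask(mol_a, 0.5, agnostic_rates, rng))
print(forward_mask(mol_a, 0.5, per_element_rates, rng))
```

Under these assumptions, at t = 0.5 the shared schedule masks every element with probability 0.5, while the per-element schedule masks the first two atoms with probability 0.75 and the distinguishing atom with probability of roughly 0.29, so the two molecules tend to stay distinguishable for longer along the forward trajectory.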