Diffusion Models in Bioinformatics: A New Wave of Deep Learning Revolution in Action
This review synthesizes existing knowledge to aid researchers in bioinformatics and computational biology, but its contribution is incremental: it surveys existing methods rather than introducing new ones.
This review addresses the lack of an overview of diffusion models in bioinformatics by providing a comprehensive survey of their applications across domains like cryo-EM data enhancement, protein design, and drug discovery, highlighting their potential for further development in the field.
Denoising diffusion models have emerged as one of the most powerful classes of generative models in recent years. They have achieved remarkable success in many fields, including computer vision, natural language processing (NLP), and bioinformatics. Although a few excellent reviews cover diffusion models and their applications in computer vision and NLP, an overview of their applications in bioinformatics is lacking. This review aims to provide a thorough overview of those applications to aid the further development of diffusion models in bioinformatics and computational biology. We start with an introduction to the key concepts and theoretical foundations of three cornerstone diffusion modeling frameworks (denoising diffusion probabilistic models, noise-conditioned score networks, and stochastic differential equations), followed by a comprehensive description of diffusion models employed in different domains of bioinformatics, including cryo-EM data enhancement, single-cell data analysis, protein design and generation, drug and small molecule design, and protein-ligand interaction. The review concludes with a summary of potential new developments and applications of diffusion models in bioinformatics.
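To make the first of the three frameworks concrete, the following is a minimal sketch of the forward (noising) process of a denoising diffusion probabilistic model, in which a clean sample x_0 is turned into x_t by sampling from q(x_t | x_0) = N(sqrt(alpha_bar_t) x_0, (1 - alpha_bar_t) I). It is an illustration only and is not taken from the review: the function names, the linear schedule with 1000 steps, and the endpoints 1e-4 and 0.02 are assumptions chosen because they are commonly used defaults.

```python
import math
import random


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linearly spaced noise variances beta_1..beta_T (assumed schedule)."""
    if num_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alpha_bar(betas):
    """alpha_bar_t = product over s <= t of (1 - beta_s)."""
    out, prod = [], 1.0
    for beta in betas:
        prod *= 1.0 - beta
        out.append(prod)
    return out


def forward_sample(x0, t, alpha_bars, rng):
    """Draw x_t from q(x_t | x_0); t is a 0-based step index.

    Returns the noised vector and the Gaussian noise that was added,
    which is the quantity a denoising network is usually trained to predict.
    """
    a = alpha_bars[t]
    noise = [rng.gauss(0.0, 1.0) for _ in x0]
    xt = [math.sqrt(a) * x + math.sqrt(1.0 - a) * e for x, e in zip(x0, noise)]
    return xt, noise


rng = random.Random(0)                 # fixed seed for reproducibility
betas = linear_beta_schedule(1000)     # T = 1000 is an assumption
alpha_bars = cumulative_alpha_bar(betas)
x0 = [1.0, -0.5, 0.25]                 # toy "data" vector, purely illustrative
x_t, eps = forward_sample(x0, 999, alpha_bars, rng)
```

Under this assumed schedule, alpha_bar at the final step is close to zero, so x_t is nearly pure Gaussian noise; generation then amounts to learning to reverse these steps one at a time.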