EvoVGM: a Deep Variational Generative Model for Evolutionary Parameter Estimation
This work addresses evolutionary parameter estimation, a problem of interest to computational biologists. It offers a novel integration of deep learning with phylogenetic inference, although the contribution is incremental in that it builds on existing substitution models.
The authors tackle evolutionary parameter estimation for biological sequences by proposing EvoVGM, a deep variational generative model that jointly approximates the posterior distributions of evolutionary parameters and generates sequence alignments. They show its effectiveness on synthetic data and its robustness on alignments of coronavirus gene S.
Most evolutionary-oriented deep generative models do not explicitly consider the underlying evolutionary dynamics of biological sequences, as is done within the Bayesian phylogenetic inference framework. In this study, we propose a deep variational Bayesian generative model (EvoVGM) that jointly approximates the true posterior of local evolutionary parameters and generates sequence alignments. The model is instantiated and tuned for continuous-time Markov chain substitution models such as JC69, K80 and GTR. We train it via a low-variance stochastic estimator and a gradient ascent algorithm. We analyze the consistency and effectiveness of EvoVGM on synthetic sequence alignments of different sizes, simulated under several evolutionary scenarios. Finally, we highlight the robustness of a fine-tuned EvoVGM model using a sequence alignment of gene S of coronaviruses.
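To make the notion of a continuous-time Markov chain substitution model concrete, the following minimal sketch computes the transition probability matrix of JC69, the simplest of the models named above. It is an illustration only and is not taken from the EvoVGM implementation: the function name `jc69_transition_matrix`, the nucleotide ordering, and the convention that the branch length is measured in expected substitutions per site are all assumptions made here.

```python
import numpy as np

def jc69_transition_matrix(branch_length: float) -> np.ndarray:
    """Return the 4x4 JC69 transition probability matrix.

    Assumption: branch_length is in expected substitutions per site,
    and rows/columns follow an arbitrary fixed nucleotide order
    (for example A, G, C, T). Under JC69 the order does not matter
    because all substitutions share the same rate.
    """
    # Exponential decay term of the JC69 closed-form solution.
    decay = np.exp(-4.0 * branch_length / 3.0)
    p_same = 0.25 + 0.75 * decay   # probability of observing the same nucleotide
    p_diff = 0.25 - 0.25 * decay   # probability of each specific different nucleotide
    matrix = np.full((4, 4), p_diff)
    np.fill_diagonal(matrix, p_same)
    return matrix

if __name__ == "__main__":
    # Each row is a probability distribution over the descendant nucleotide.
    print(jc69_transition_matrix(0.1))
```

At a branch length of zero the matrix is the identity, and for very long branches every entry approaches 0.25, the uniform stationary distribution of JC69. K80 and GTR relax the equal-rate assumption with additional parameters, which is where posterior estimation of substitution parameters becomes relevant.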