Leaping through tree space: continuous phylogenetic inference for rooted and unrooted trees
This work addresses the problem of data-deficient phylogenetic inference in the life sciences, offering a novel continuous approach that applies gradient-based optimization to a discrete problem, an incremental step in that line of work.
The paper tackles the challenge of finding suitable phylogenies in the vast space of possible trees by performing tree exploration and inference in a continuous space, which the authors describe as a first. This allows major leaps across tree space and reduces susceptibility to local minima. The approach outperforms the current best methods for unrooted trees, accurately infers trees and roots in ultrametric simulations, and remains effective with very little data, as demonstrated on the jawed vertebrate phylogeny.
Phylogenetics is now fundamental in life sciences, providing insights into the earliest branches of life and the origins and spread of epidemics. However, finding suitable phylogenies from the vast space of possible trees remains challenging. To address this problem, for the first time, we perform both tree exploration and inference in a continuous space where the computation of gradients is possible. This continuous relaxation allows for major leaps across tree space in both rooted and unrooted trees, and is less susceptible to convergence to local minima. Our approach outperforms the current best methods for inference on unrooted trees and, in simulation, accurately infers the tree and root in ultrametric cases. The approach remains effective on empirical datasets where very little data is available, which we demonstrate on the phylogeny of jawed vertebrates. Indeed, only a few genes with an ultrametric signal were generally sufficient for resolving the major lineages of vertebrates. Optimisation is possible via automatic differentiation, and our method presents an effective way forward for exploring the most difficult, data-deficient phylogenetic questions.
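To make the idea of a continuous relaxation concrete, the following is a minimal toy sketch and not the method of the paper. It assumes four taxa with hypothetical, illustrative pairwise distances, scores each of the three possible unrooted quartet topologies by the sum of its two within-pair distances (the four-point criterion), and replaces the discrete choice of topology with softmax weights over real-valued parameters. The gradient of the expected score is written out by hand so that the example needs only the standard library; in practice an automatic differentiation framework would supply it. All names (`DIST`, `TOPOLOGIES`, `optimise`) are illustrative assumptions.

```python
import math

# Hypothetical pairwise distances for taxa A, B, C, D (illustrative values only).
# They are additive on the unrooted tree ((A,B),(C,D)).
DIST = {
    ("A", "B"): 0.3, ("A", "C"): 0.9, ("A", "D"): 1.0,
    ("B", "C"): 0.8, ("B", "D"): 0.9, ("C", "D"): 0.5,
}

# The three unrooted topologies on four taxa, each given as two cherries.
TOPOLOGIES = [
    (("A", "B"), ("C", "D")),
    (("A", "C"), ("B", "D")),
    (("A", "D"), ("B", "C")),
]

# Four-point score: sum of the two within-cherry distances (lower is better).
SCORES = [DIST[left] + DIST[right] for left, right in TOPOLOGIES]


def softmax(theta):
    """Map unconstrained parameters to a probability vector over topologies."""
    m = max(theta)  # subtract the maximum for numerical stability
    exps = [math.exp(t - m) for t in theta]
    total = sum(exps)
    return [e / total for e in exps]


def expected_score(theta):
    """Relaxed objective: the score averaged under the softmax weights."""
    return sum(p * s for p, s in zip(softmax(theta), SCORES))


def gradient(theta):
    """Analytic gradient of expected_score: d/d theta_k = p_k * (s_k - E[s])."""
    p = softmax(theta)
    mean = sum(pk * sk for pk, sk in zip(p, SCORES))
    return [pk * (sk - mean) for pk, sk in zip(p, SCORES)]


def optimise(steps=200, lr=1.0):
    """Plain gradient descent starting from uniform weights."""
    theta = [0.0] * len(SCORES)
    for _ in range(steps):
        g = gradient(theta)
        theta = [t - lr * gk for t, gk in zip(theta, g)]
    return theta


if __name__ == "__main__":
    weights = softmax(optimise())
    best = max(range(len(weights)), key=weights.__getitem__)
    print("weights:", weights)
    print("selected topology:", TOPOLOGIES[best])
```

Because the parameters are continuous, a single gradient step shifts weight across all candidate topologies at once rather than moving between neighbouring trees one rearrangement at a time. Enumerating every topology, as done here, is only feasible for a handful of taxa; it serves purely to illustrate the relaxation.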