Machine-learning a virus assembly fitness landscape
This work addresses the challenge of efficiently modeling virus assembly fitness landscapes for researchers in computational biology and virology, representing an incremental improvement by applying existing ML techniques to a specific domain.
The researchers tackled the problem of constructing realistic evolutionary fitness landscapes for virus assembly by using machine learning to predict assembly efficiency from genome sequences. The trained model produces accurate predictions in minutes, whereas the stochastic assembly models it replaces are computationally expensive.
Realistic evolutionary fitness landscapes are notoriously difficult to construct. A recent cutting-edge model of virus assembly consists of a dodecahedral capsid with $12$ corresponding packaging signals, each falling into one of three affinity bands. The whole genome/phenotype space, consisting of $3^{12}$ genomes, has been explored via computationally expensive stochastic assembly models, giving a fitness landscape in terms of the assembly efficiency. By training a neural network with the latest machine-learning techniques, we show that this intensive computation can be short-circuited, taking a matter of minutes and achieving remarkably high accuracy.
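To make the setup concrete, the following is a minimal sketch of the general idea, not the authors' implementation. It enumerates all $3^{12} = 531441$ genomes, one-hot encodes a random sample, and fits a small neural network to predict a fitness value from the genome. The function `toy_efficiency` is a hypothetical placeholder standing in for the stochastic assembly model, whose outputs are not given here; the network size, learning rate, and sample sizes are likewise illustrative assumptions.

```python
import numpy as np

N_SIGNALS, N_BANDS = 12, 3  # from the text: 12 packaging signals, 3 affinity bands


def enumerate_genomes(n_signals=N_SIGNALS, n_bands=N_BANDS):
    """Return every genome as a row of band indices in {0, ..., n_bands - 1}."""
    idx = np.arange(n_bands ** n_signals)
    powers = n_bands ** np.arange(n_signals)
    return ((idx[:, None] // powers) % n_bands).astype(np.int8)


def one_hot(genomes, n_bands=N_BANDS):
    """Encode each packaging signal as an indicator vector over the bands."""
    return np.eye(n_bands)[genomes].reshape(len(genomes), -1)


def toy_efficiency(genomes, weights):
    """HYPOTHETICAL stand-in for assembly efficiency (not the real model)."""
    score = genomes @ weights
    return 1.0 / (1.0 + np.exp(-(score - weights.sum()) / 3.0))


def train_mlp(X, y, hidden=32, lr=0.05, steps=2000, seed=0):
    """Fit a one-hidden-layer tanh network by full-batch gradient descent."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W1 = rng.normal(0.0, 1.0 / np.sqrt(d), (d, hidden))
    b1 = np.zeros(hidden)
    W2 = rng.normal(0.0, 1.0 / np.sqrt(hidden), hidden)
    b2 = y.mean()
    for _ in range(steps):
        h = np.tanh(X @ W1 + b1)                    # hidden activations
        err = h @ W2 + b2 - y                       # prediction residuals
        grad_h = np.outer(err, W2) * (1.0 - h**2)   # backprop through tanh
        W2 -= lr * (h.T @ err) / n
        b2 -= lr * err.mean()
        W1 -= lr * (X.T @ grad_h) / n
        b1 -= lr * grad_h.mean(axis=0)
    return lambda Z: np.tanh(Z @ W1 + b1) @ W2 + b2


rng = np.random.default_rng(0)
genomes = enumerate_genomes()

# Label only a small random subset, mimicking a limited budget of expensive runs.
sample = rng.choice(len(genomes), size=4000, replace=False)
train_idx, test_idx = sample[:2000], sample[2000:]
weights = rng.uniform(0.5, 1.5, N_SIGNALS)  # arbitrary toy parameters

X_train, y_train = one_hot(genomes[train_idx]), toy_efficiency(genomes[train_idx], weights)
X_test, y_test = one_hot(genomes[test_idx]), toy_efficiency(genomes[test_idx], weights)

predict = train_mlp(X_train, y_train)
test_mse = np.mean((predict(X_test) - y_test) ** 2)
baseline = y_test.var()  # error of always predicting the mean
print(f"genomes: {len(genomes)}, test MSE: {test_mse:.5f}, baseline: {baseline:.5f}")
```

In a real application the labels would come from stochastic assembly simulations on a subset of genomes, and the trained network would then be evaluated on the remaining genomes to fill in the landscape. The toy target here is much smoother than a realistic assembly landscape is likely to be, so the accuracy it reaches says nothing about the accuracy reported in this work.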