Bayesian Optimization for Synthetic Gene Design
This work addresses gene design for synthetic biology applications, presenting an incremental improvement by adapting Bayesian optimization to this domain.
The authors develop a Bayesian optimization approach to synthetic gene design, where the design space consists of long character strings of varying length and direct optimization is intractable. The method optimizes several aspects of interest simultaneously and is demonstrated in a real experiment with mammalian cells.
We address the problem of synthetic gene design using Bayesian optimization. The main issue when designing a gene is that the design space is defined in terms of long strings of characters of different lengths, which renders the optimization intractable. We propose a three-step approach to deal with this issue. First, we use a Gaussian process model to emulate the behavior of the cell. As inputs of the model, we use a set of biologically meaningful gene features, which allows us to define optimal gene design rules. Second, based on the model outputs, we define a multi-task acquisition function to optimize several aspects of interest simultaneously. Finally, we define an evaluation function, which allows us to rank sets of candidate gene sequences that are coherent with the optimal design strategy. We illustrate the performance of this approach in a real gene design experiment with mammalian cells.
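The three steps can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the two features (GC fraction and length), the RBF kernel, the weighted sum of upper confidence bounds used as a stand-in for the multi-task acquisition function, and every function name and number below are hypothetical choices made only to show the flow from sequences to features to a ranking.

```python
import numpy as np

def features(seq):
    """Map a variable-length sequence to a fixed-size vector.
    Hypothetical features: GC fraction and length in kilobases."""
    gc = sum(seq.count(c) for c in "GC") / len(seq)
    return np.array([gc, len(seq) / 1000.0])

def rbf(A, B, length_scale=0.2):
    """Squared-exponential kernel between the rows of A and B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / length_scale ** 2)

def gp_posterior(X, y, Xs, noise=1e-2):
    """Step 1: zero-mean GP posterior mean and std at test points Xs."""
    K = rbf(X, X) + noise * np.eye(len(X))
    Ks = rbf(X, Xs)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    mean = Ks.T @ alpha
    v = np.linalg.solve(L, Ks)
    var = 1.0 - (v ** 2).sum(0)  # prior variance of the RBF kernel is 1
    return mean, np.sqrt(np.clip(var, 0.0, None))

def acquisition(X, Y, Xs, weights, kappa=2.0):
    """Step 2 (assumed form): weighted sum of per-task upper confidence
    bounds, one independent GP per column of Y."""
    total = np.zeros(len(Xs))
    for w, y in zip(weights, Y.T):
        mu, sd = gp_posterior(X, y - y.mean(), Xs)
        total += w * (mu + y.mean() + kappa * sd)
    return total

def rank_candidates(train_seqs, Y, candidates, weights=(0.5, 0.5)):
    """Step 3 (assumed form): rank candidate sequences by acquisition value."""
    X = np.array([features(s) for s in train_seqs])
    Xs = np.array([features(s) for s in candidates])
    scores = acquisition(X, np.asarray(Y, dtype=float), Xs, weights)
    order = np.argsort(-scores)
    return [(candidates[i], float(scores[i])) for i in order]

# Toy inputs invented for illustration; these are not experimental measurements.
train_seqs = ["ATGCGC", "ATATATAT", "GGGCCCAT", "ATGAAATTTGGG"]
Y = [[1.0, 0.2], [0.1, 0.9], [1.2, 0.1], [0.5, 0.5]]  # two objectives per gene
candidates = ["ATGGCC", "ATATGC", "GCGCGCGC"]

ranked = rank_candidates(train_seqs, Y, candidates)
for seq, score in ranked:
    print(f"{seq}\t{score:.3f}")
```

Working in feature space is what makes the problem tractable in this sketch: the Gaussian process never sees the raw strings, so sequences of different lengths become comparable points in a small continuous space, and the string-level search reduces to scoring a finite set of candidates.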