Model Inversion Networks for Model-Based Optimization
This work addresses optimization challenges in domains such as bioinformatics and computer vision, where traditional methods struggle with out-of-distribution inputs, though it appears incremental because it builds on existing model-based optimization frameworks.
The paper tackles data-driven optimization in high-dimensional spaces where valid inputs are sparse, such as protein sequences or images. It proposes model inversion networks (MINs), which learn an inverse mapping from scores to inputs, and reports that they scale and remain effective across tasks including Bayesian optimization, image and protein design, and contextual bandit optimization.
In this work, we aim to solve data-driven optimization problems, where the goal is to find an input that maximizes an unknown score function given access to a dataset of inputs with corresponding scores. When the inputs are high-dimensional and valid inputs constitute a small subset of this space (e.g., valid protein sequences or valid natural images), such model-based optimization problems become exceptionally difficult, since the optimizer must avoid out-of-distribution and invalid inputs. We propose to address such problems with model inversion networks (MINs), which learn an inverse mapping from scores to inputs. MINs can scale to high-dimensional input spaces and leverage offline logged data for both contextual and non-contextual optimization problems. MINs can also handle both purely offline data sources and active data collection. We evaluate MINs on tasks from the Bayesian optimization literature, high-dimensional model-based optimization problems over images and protein designs, and contextual bandit optimization from logged data.
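To make the idea of an inverse mapping from scores to inputs concrete, the following is a minimal sketch and not the authors' method. It assumes a toy setting in which valid inputs lie near a line in three dimensions, and it swaps the network for a per-coordinate linear-Gaussian model fitted by least squares. The names `fit_inverse_model`, `sample_input`, `true_score`, and `direction` are hypothetical, and `true_score` is used only to build and check the toy data; in the setting described above the score function is unknown.

```python
import random


def fit_inverse_model(inputs, scores):
    """Fit x_j ~= slope_j * y + intercept_j for every input coordinate j.

    Returns a list of (slope, intercept, noise_std) tuples, one per coordinate.
    This linear-Gaussian model is a stand-in for a learned inverse map.
    """
    n = len(scores)
    dim = len(inputs[0])
    y_mean = sum(scores) / n
    y_var = sum((y - y_mean) ** 2 for y in scores)
    model = []
    for j in range(dim):
        col = [x[j] for x in inputs]
        x_mean = sum(col) / n
        cov = sum((y - y_mean) * (xj - x_mean) for y, xj in zip(scores, col))
        slope = cov / y_var
        intercept = x_mean - slope * y_mean
        resid = [xj - (slope * y + intercept) for y, xj in zip(scores, col)]
        noise_std = (sum(r * r for r in resid) / n) ** 0.5
        model.append((slope, intercept, noise_std))
    return model


def sample_input(model, target_score, rng):
    """Draw one input conditioned on the requested score."""
    return [
        slope * target_score + intercept + noise_std * rng.gauss(0.0, 1.0)
        for slope, intercept, noise_std in model
    ]


# Toy data (assumption): valid inputs lie close to a line through the origin.
rng = random.Random(0)
direction = [0.6, 0.8, 0.0]  # unit vector spanning the "valid" inputs


def true_score(x):
    # Hypothetical ground-truth score, used only to generate and check data.
    return sum(d * xi for d, xi in zip(direction, x))


inputs = []
for _ in range(200):
    t = rng.uniform(0.0, 1.0)
    inputs.append([t * d + rng.gauss(0.0, 0.01) for d in direction])
scores = [true_score(x) for x in inputs]

model = fit_inverse_model(inputs, scores)

# Query a score slightly above the best one observed in the dataset.
target = max(scores) + 0.1
candidates = [sample_input(model, target, rng) for _ in range(100)]
mean_score = sum(true_score(x) for x in candidates) / len(candidates)
```

In this toy setting the sampled candidates stay close to the line of valid inputs because they are generated from the score rather than found by searching the input space directly. How far above the observed scores one can safely query is a design choice; the margin of 0.1 used here is arbitrary.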