Multi-Objective Coverage via Constraint Active Search
This work addresses the need for faster evaluation in critical applications such as drug discovery and materials design, though the contribution is incremental because it builds on existing active search and multi-objective optimization methods.
The paper tackles the multi-objective coverage (MOC) problem by proposing MOC-CAS, a novel search algorithm that identifies a small set of representative samples covering the feasible multi-objective space. It outperforms competitive baselines on large-scale protein-target datasets for SARS-CoV-2 and cancer, each assessed on five objectives.
In this paper, we formulate the new multi-objective coverage (MOC) problem, where the goal is to identify a small set of representative samples whose predicted outcomes broadly cover the feasible multi-objective space. This problem is of great importance in many critical real-world applications, e.g., drug discovery and materials design, because the representative set can be evaluated much faster than the whole feasible set, significantly accelerating the scientific discovery process. Existing works cannot be directly applied because they focus either on sample-space coverage or on multi-objective optimization that targets the Pareto front. However, chemically diverse samples often yield identical objective profiles, and safety constraints are usually defined on the objectives. To solve the MOC problem, we propose a novel search algorithm, MOC-CAS, which employs an upper confidence bound-based acquisition function to select optimistic samples guided by Gaussian process posterior predictions. To enable efficient optimization, we develop a smoothed relaxation of the hard feasibility test and derive an approximate optimizer. We show empirically that MOC-CAS achieves superior performance over competitive baselines across large-scale protein-target datasets for SARS-CoV-2 and cancer, each assessed on five objectives derived from SMILES-based features.
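The abstract names three ingredients: an upper confidence bound on Gaussian process predictions, a smoothed feasibility test, and an approximate optimizer. The following is a minimal illustrative sketch of how such pieces could fit together, not the paper's actual formulation. It assumes that posterior means and standard deviations are already available per candidate and objective, that feasibility means every objective exceeds a threshold, that the smoothing is a product of sigmoids, and that coverage is measured with Euclidean balls of a fixed radius selected greedily. All function names and parameters (`beta`, `temperature`, `radius`, `min_gain`) are hypothetical.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def ucb(mu, sigma, beta=2.0):
    # Optimistic estimate per objective: posterior mean plus beta * std.
    return [m + beta * s for m, s in zip(mu, sigma)]


def soft_feasibility(y, thresholds, temperature=0.1):
    # Assumed relaxation: replace the hard test "y_k >= t_k for all k"
    # by a product of sigmoids, which tends to 0/1 as temperature -> 0.
    p = 1.0
    for yk, tk in zip(y, thresholds):
        p *= sigmoid((yk - tk) / temperature)
    return p


def greedy_cover(mus, sigmas, thresholds, radius, budget,
                 beta=2.0, temperature=0.1, min_gain=1e-6):
    # Assumed approximate optimizer: greedily pick the candidate whose
    # objective-space ball covers the most not-yet-covered soft-feasible mass.
    opt = [ucb(m, s, beta) for m, s in zip(mus, sigmas)]
    feas = [soft_feasibility(y, thresholds, temperature) for y in opt]
    n = len(opt)
    covered = [False] * n
    chosen = []
    for _ in range(budget):
        best, best_gain = None, min_gain
        for i in range(n):
            if i in chosen:
                continue
            gain = sum(feas[j] for j in range(n)
                       if not covered[j]
                       and math.dist(opt[i], opt[j]) <= radius)
            if gain > best_gain:
                best, best_gain = i, gain
        if best is None:
            break  # no remaining candidate adds meaningful coverage
        chosen.append(best)
        for j in range(n):
            if math.dist(opt[best], opt[j]) <= radius:
                covered[j] = True
    return chosen


# Toy data: four candidates, two objectives, zero posterior uncertainty.
mus = [[1.0, 1.0], [1.05, 1.0], [3.0, 3.0], [-5.0, -5.0]]
sigmas = [[0.0, 0.0]] * 4
picked = greedy_cover(mus, sigmas, thresholds=[0.0, 0.0],
                      radius=0.5, budget=3)
print(picked)
```

In this toy run the first two candidates have nearly identical objective profiles, so only one of them is selected, and the last candidate is skipped because its soft feasibility is negligible. This mirrors the abstract's point that diverse samples can share an objective profile and that constraints are defined on the objectives.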