Bayesian Optimization for Molecules Should Be Pareto-Aware
This work provides concrete evidence for the practical advantages of Pareto-aware acquisition in molecular optimization, especially when evaluation budgets are limited and trade-offs are nontrivial.
The authors study multi-objective Bayesian optimization (MOBO) for molecular design by benchmarking a Pareto-based strategy, Expected Hypervolume Improvement (EHVI), against a fixed-weight scalarized baseline using Expected Improvement (EI), with identical Gaussian Process surrogates and molecular representations. Across three molecular optimization tasks, EHVI consistently outperformed scalarized EI, with better Pareto front coverage, faster convergence, and higher chemical diversity.
Multi-objective Bayesian optimization (MOBO) provides a principled framework for navigating trade-offs in molecular design. However, its empirical advantages over scalarized alternatives remain underexplored. We benchmark a simple Pareto-based MOBO strategy -- Expected Hypervolume Improvement (EHVI) -- against a simple fixed-weight scalarized baseline using Expected Improvement (EI), under a tightly controlled setup with identical Gaussian Process surrogates and molecular representations. Across three molecular optimization tasks, EHVI consistently outperforms scalarized EI in terms of Pareto front coverage, convergence speed, and chemical diversity. While scalarization encompasses flexible variants -- including random or adaptive schemes -- our results show that even strong deterministic instantiations can underperform in low-data regimes. These findings offer concrete evidence for the practical advantages of Pareto-aware acquisition in de novo molecular optimization, especially when evaluation budgets are limited and trade-offs are nontrivial.
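To make the contrast between the two acquisition criteria concrete, the sketch below compares the deterministic quantities that underlie them: the hypervolume gain of a candidate (the quantity EHVI takes an expectation of) and the gain in a fixed-weight scalarized score (the quantity scalarized EI takes an expectation of). It is a toy illustration, not the paper's implementation: there is no Gaussian Process and no expectation over a posterior, and the objective values, reference point, weights, and function names are all assumptions chosen for the example, with both objectives maximized.

```python
import numpy as np

def pareto_mask(Y):
    """Boolean mask of the non-dominated rows of Y (all objectives maximized)."""
    Y = np.asarray(Y, dtype=float)
    mask = np.ones(len(Y), dtype=bool)
    for i, y in enumerate(Y):
        # y is dominated if some row is >= in every objective and > in at least one
        dominated = np.all(Y >= y, axis=1) & np.any(Y > y, axis=1)
        mask[i] = not dominated.any()
    return mask

def hypervolume_2d(Y, ref):
    """Area dominated by Y and bounded below by ref (two objectives only)."""
    Y = np.asarray(Y, dtype=float)
    ref = np.asarray(ref, dtype=float)
    Y = Y[np.all(Y > ref, axis=1)]          # keep points that improve on ref
    if len(Y) == 0:
        return 0.0
    front = Y[pareto_mask(Y)]
    front = front[np.argsort(-front[:, 0])]  # sort by first objective, descending
    hv, prev_y2 = 0.0, ref[1]
    for y1, y2 in front:
        # each front point adds a horizontal strip above the previous one
        hv += (y1 - ref[0]) * (y2 - prev_y2)
        prev_y2 = y2
    return float(hv)

def hv_gain(Y, candidate, ref):
    """Hypervolume improvement from adding one candidate to the observed set."""
    return hypervolume_2d(np.vstack([Y, candidate]), ref) - hypervolume_2d(Y, ref)

def scalarized_gain(Y, candidate, weights):
    """Improvement of a fixed-weight sum over the best observed weighted sum."""
    Y = np.asarray(Y, dtype=float)
    w = np.asarray(weights, dtype=float)
    best = (Y @ w).max()
    return max(0.0, float(np.dot(candidate, w) - best))

# Illustrative values only (not taken from the paper).
observed = [[0.2, 0.8], [0.8, 0.2]]   # two extreme trade-offs already evaluated
ref = [0.0, 0.0]                      # hypervolume reference point
weights = [0.7, 0.3]                  # fixed scalarization weights
cand_a = [0.5, 0.5]                   # balanced candidate that fills the gap in the front
cand_b = [0.9, 0.1]                   # candidate that pushes the already-favored objective

for name, cand in [("A", cand_a), ("B", cand_b)]:
    print(name, hv_gain(observed, cand, ref), scalarized_gain(observed, cand, weights))
```

With these assumed numbers, the balanced candidate A yields a hypervolume gain of about 0.09 and a scalarized gain of 0, while candidate B yields about 0.01 and 0.04 respectively. The hypervolume criterion therefore prefers the point that extends front coverage, whereas the fixed-weight score prefers the point aligned with its weights, which is one intuition for why a fixed scalarization can cover the Pareto front less well.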