Bencher: Simple and Reproducible Benchmarking for Black-Box Optimization
Bencher addresses the need for reproducible and scalable benchmarking in optimization research, though the contribution is incremental: it builds on existing benchmarking suites and adds a new modular design.
The paper tackles benchmarking for black-box optimization by introducing Bencher, a framework that decouples benchmark execution from optimization logic. This decoupling eliminates dependency conflicts and simplifies integration, and the framework supports 80 benchmarks across continuous, categorical, and binary domains with minimal client setup.
We present Bencher, a modular benchmarking framework for black-box optimization that fundamentally decouples benchmark execution from optimization logic. Unlike prior suites that focus on combining many benchmarks in a single project, Bencher introduces a clean abstraction boundary: each benchmark is isolated in its own virtual Python environment and accessed via a unified, version-agnostic remote procedure call (RPC) interface. This design eliminates dependency conflicts and simplifies the integration of diverse, real-world benchmarks, which often have complex and conflicting software requirements. Bencher can be deployed locally or remotely via Docker or on high-performance computing (HPC) clusters via Singularity, providing a containerized, reproducible runtime for any benchmark. Its lightweight client requires minimal setup and supports drop-in evaluation of 80 benchmarks across continuous, categorical, and binary domains.
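The decoupling described above can be illustrated with a minimal sketch. It is not Bencher's actual API: it assumes Python's standard-library XML-RPC modules as a stand-in for the RPC layer, and the names `sphere`, `evaluate`, `start_benchmark_server`, and `random_search` are hypothetical. The benchmark runs behind a server, and the optimizer only ever sees a remote callable.

```python
import random
import threading
from xmlrpc.client import ServerProxy
from xmlrpc.server import SimpleXMLRPCServer


def sphere(point):
    """Toy benchmark (hypothetical stand-in): sum of squares, minimum 0 at the origin."""
    return float(sum(x * x for x in point))


def start_benchmark_server():
    """Serve the benchmark over XML-RPC on a free local port; return (server, url)."""
    # Port 0 asks the OS for any free port. In a real deployment this process
    # would live in its own environment or container (assumption for illustration).
    server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)
    server.register_function(sphere, "evaluate")
    threading.Thread(target=server.serve_forever, daemon=True).start()
    host, port = server.server_address
    return server, f"http://{host}:{port}"


def random_search(evaluate, dim, budget, seed=0):
    """Optimizer side: depends only on a callable, not on the benchmark's packages."""
    rng = random.Random(seed)
    best_x, best_y = None, float("inf")
    for _ in range(budget):
        x = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
        y = evaluate(x)  # one remote call per evaluation
        if y < best_y:
            best_x, best_y = x, y
    return best_x, best_y


server, url = start_benchmark_server()
with ServerProxy(url) as client:
    best_x, best_y = random_search(client.evaluate, dim=3, budget=20)
print(f"best value after 20 evaluations: {best_y:.4f}")
server.shutdown()
server.server_close()
```

Because the optimizer receives only `client.evaluate`, the benchmark process could be moved into a separate virtual environment or container without changing the optimization code, which is the property the abstract emphasizes.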