CATBench: A Compiler Autotuning Benchmarking Suite for Black-box Optimization
CATBench addresses a gap for researchers in Bayesian optimization and compiler autotuning by providing a unified, reproducible benchmarking tool, although the contribution is incremental in that it builds on existing benchmarking methods.
The authors address the lack of standardized benchmarks for Bayesian optimization in compiler autotuning by introducing CATBench, a comprehensive suite that captures complex structural challenges and spans a range of machine learning computations. They validate the suite on state-of-the-art algorithms to reveal those algorithms' strengths and weaknesses.
Bayesian optimization is a powerful method for automating the tuning of compilers. The complex landscape of autotuning presents a myriad of rarely considered structural challenges for black-box optimizers, and the lack of standardized benchmarks has limited the study of Bayesian optimization within the domain. To address this, we present CATBench, a comprehensive benchmarking suite that captures the complexities of compiler autotuning, ranging from discrete, conditional, and permutation parameter types to known and unknown binary constraints, as well as both multi-fidelity and multi-objective evaluations. The benchmarks in CATBench span a range of machine learning-oriented computations, from tensor algebra to image processing and clustering, and use state-of-the-art compilers, such as TACO and RISE/ELEVATE. CATBench offers a unified interface for evaluating Bayesian optimization algorithms, promoting reproducibility and innovation through an easy-to-use, fully containerized setup of both surrogate and real-world compiler optimization tasks. We validate CATBench on several state-of-the-art algorithms, revealing their strengths and weaknesses and demonstrating the suite's potential for advancing both Bayesian optimization and compiler autotuning research.
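To make the parameter and constraint types named above concrete, the following is a minimal Python sketch of a toy autotuning search space driven by plain random search. It is not CATBench's interface: every name (`sample_config`, `satisfies_known_constraint`, `evaluate`, `random_search`), the parameters, and the runtime formula are hypothetical and chosen only for illustration. Random search stands in for a Bayesian optimizer. A known constraint can be checked before an evaluation is spent, whereas an unknown (hidden) constraint is only observed as a failed evaluation.

```python
import math
import random


def sample_config(rng):
    """Draw one configuration from a toy (hypothetical) search space."""
    config = {
        "tile": rng.choice([4, 8, 16, 32, 64]),          # discrete parameter
        "order": tuple(rng.sample(["i", "j", "k"], 3)),  # permutation parameter
        "unroll": rng.choice([False, True]),             # categorical parameter
    }
    if config["unroll"]:
        # Conditional parameter: only present when unrolling is enabled.
        config["unroll_factor"] = rng.choice([2, 4, 8])
    return config


def satisfies_known_constraint(config):
    """Known constraint: can be checked without running the black box."""
    return config.get("unroll_factor", 1) <= config["tile"]


def evaluate(config):
    """Toy black box returning (feasible, runtime).

    The hidden constraint is only discovered by evaluating, in the way a
    compilation or execution failure would be.
    """
    if config["tile"] == 64 and config.get("unroll_factor") == 8:
        return False, math.inf  # hidden constraint violated
    runtime = 64.0 / config["tile"]
    runtime += 2.0 * (2 - config["order"].index("k"))  # "k" innermost is cheapest
    runtime -= 0.1 * config.get("unroll_factor", 0)
    return True, runtime


def random_search(budget, seed=0):
    """Baseline optimizer: spend `budget` evaluations on random configurations."""
    rng = random.Random(seed)
    best_config, best_runtime = None, math.inf
    evaluated = 0
    while evaluated < budget:
        config = sample_config(rng)
        if not satisfies_known_constraint(config):
            continue  # rejected up front, no evaluation spent
        feasible, runtime = evaluate(config)
        evaluated += 1
        if feasible and runtime < best_runtime:
            best_config, best_runtime = config, runtime
    return best_config, best_runtime


if __name__ == "__main__":
    print(random_search(budget=50))
```

In this sketch, a Bayesian optimizer would replace the sampling step inside `random_search`. The black box would stay the same, which is the separation a unified benchmarking interface is meant to provide.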