Is One Epoch All You Need For Multi-Fidelity Hyperparameter Optimization?
This work addresses cost reduction in hyperparameter optimization for machine learning practitioners, but its contribution is incremental: it highlights a baseline rather than introducing a new method.
The paper tackles the computational expense of multi-fidelity hyperparameter optimization by comparing established methods against a simple baseline that discards all but the Top-K models after one epoch. It finds that the baseline achieves similar results with an order of magnitude less computation.
Hyperparameter optimization (HPO) is crucial for fine-tuning machine learning models but can be computationally expensive. To reduce costs, multi-fidelity HPO (MF-HPO) leverages intermediate accuracy levels in the learning process and discards low-performing models early on. We compared various representative MF-HPO methods against a simple baseline on classical benchmark data. The baseline involved discarding all models except the Top-K after training for only one epoch, followed by further training to select the best model. Surprisingly, this baseline achieved similar results to its counterparts, while requiring an order of magnitude less computation. Upon analyzing the learning curves of the benchmark data, we observed a few dominant learning curves, which explained the success of our baseline. This suggests that researchers should (1) always use the suggested baseline in benchmarks and (2) broaden the diversity of MF-HPO benchmarks to include more complex cases.
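The one-epoch Top-K baseline described above can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `one_epoch_top_k` and `toy_score_at_epoch`, the `lr` hyperparameter, the epoch budget, and the synthetic learning curve are all hypothetical and chosen only for illustration. The toy curves are built so that they never cross, which loosely mimics the "dominant learning curves" situation in which ranking after one epoch already identifies the best model.

```python
from typing import Callable, Dict, List, Tuple

Config = Dict[str, float]


def toy_score_at_epoch(config: Config, epoch: int) -> float:
    """Hypothetical validation score after `epoch` epochs of training.

    Assumption: every curve saturates at the same rate toward its own
    asymptote, so curves never cross (dominant learning curves).
    """
    asymptote = 1.0 - abs(config["lr"] - 0.1)  # best at lr = 0.1 (made up)
    return asymptote * (1.0 - 0.5 ** epoch)


def one_epoch_top_k(
    configs: List[Config],
    score_at_epoch: Callable[[Config, int], float],
    k: int,
    max_epochs: int,
) -> Tuple[Config, float, int]:
    """Return (best config, its final score, total epochs of training spent)."""
    # Stage 1: train every candidate for exactly one epoch.
    first_scores = [(score_at_epoch(c, 1), i) for i, c in enumerate(configs)]
    epochs_spent = len(configs)

    # Keep only the K candidates with the highest one-epoch score.
    survivors = sorted(first_scores, reverse=True)[:k]

    # Stage 2: train the survivors further, then pick the best final score.
    best_config, best_score = configs[survivors[0][1]], float("-inf")
    for score, i in survivors:
        for epoch in range(2, max_epochs + 1):
            score = score_at_epoch(configs[i], epoch)
            epochs_spent += 1
        if score > best_score:
            best_config, best_score = configs[i], score
    return best_config, best_score, epochs_spent


if __name__ == "__main__":
    candidates = [{"lr": lr} for lr in (0.01, 0.05, 0.1, 0.3, 0.5, 0.9)]
    best, final_score, spent = one_epoch_top_k(
        candidates, toy_score_at_epoch, k=2, max_epochs=10
    )
    full_cost = len(candidates) * 10  # cost of training every model fully
    print(best, final_score, spent, full_cost)
```

In this toy setting the baseline spends one epoch per candidate plus the remaining epochs for the K survivors, compared with `len(candidates) * max_epochs` epochs for full training of every model. When learning curves do cross, the one-epoch ranking can discard the eventual winner, which is why the abstract calls for benchmarks with more complex cases.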