AI DS LGJun 21, 2020

Refined bounds for algorithm configuration: The knife-edge of dual class approximability

Maria-Florina Balcan, Tuomas Sandholm, Ellen Vitercik

arXiv:2006.11827v215.816 citations

Originality Incremental advance

AI Analysis

This work addresses a fundamental issue in automating algorithm configuration for researchers and practitioners, providing both theoretical insights and practical improvements, though it is incremental in refining existing bounds.

The paper tackles the problem of determining the required training set size for algorithm configuration to ensure that empirical performance generalizes to future instances, showing that strong sample complexity bounds are possible under L-infinity approximation but not under L-p norms for p < infinity. In experiments on integer programming, they achieve sample complexity bounds up to 700 times smaller than prior best-known bounds.

Automating algorithm configuration is growing increasingly necessary as algorithms come with more and more tunable parameters. It is common to tune parameters using machine learning, optimizing performance metrics such as runtime and solution quality. The training set consists of problem instances from the specific domain at hand. We investigate a fundamental question about these techniques: how large should the training set be to ensure that a parameter's average empirical performance over the training set is close to its expected, future performance? We answer this question for algorithm configuration problems that exhibit a widely-applicable structure: the algorithm's performance as a function of its parameters can be approximated by a "simple" function. We show that if this approximation holds under the L-infinity norm, we can provide strong sample complexity bounds. On the flip side, if the approximation holds only under the L-p norm for p smaller than infinity, it is not possible to provide meaningful sample complexity bounds in the worst case. We empirically evaluate our bounds in the context of integer programming, one of the most powerful tools in computer science. Via experiments, we obtain sample complexity bounds that are up to 700 times smaller than the previously best-known bounds.

View on arXiv PDF

Similar