LG · ML · Sep 5, 2018

Coverage-Based Designs Improve Sample Mining and Hyper-Parameter Optimization

arXiv:1809.01712v3
Originality: Incremental advance
AI Analysis

This work addresses the need for better initial sampling in sequential optimization for ML practitioners, offering incremental improvements over discrepancy-based approaches.

The paper tackles exploratory sampling for machine learning tasks such as sample mining and hyper-parameter optimization. It introduces coverage-based designs, such as Poisson disk sampling, which consistently outperformed existing exploratory sampling methods in the reported experiments.

Sampling one or more effective solutions from large search spaces is a recurring problem in machine learning, and sequential optimization has become a popular solution. Typical examples include data summarization, sample mining for predictive modeling, and hyper-parameter optimization. Existing solutions attempt to adaptively trade off between global exploration and local exploitation, wherein the initial exploratory sample is critical to their success. While discrepancy-based samples have become the de facto approach for exploration, results from computer graphics suggest that coverage-based designs, e.g. Poisson disk sampling, can be a superior alternative. To successfully adapt coverage-based sample designs, which were originally developed for 2-d image analysis, to ML applications, we propose fundamental advances by constructing a parameterized family of designs with provably improved coverage characteristics, and by developing algorithms for effective sample synthesis. Using experiments in sample mining and hyper-parameter optimization for supervised learning, we show that our approach consistently outperforms existing exploratory sampling methods in both blind exploration and sequential search with Bayesian optimization.
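To make the idea of a coverage-based design concrete, here is a minimal sketch of Poisson disk sampling by naive dart throwing in the unit hypercube. It is not the paper's parameterized family or its synthesis algorithm; the function names, the `r_min` and `max_tries` parameters, and the choice of Euclidean distance are all assumptions made for illustration. The only property it enforces is that no two accepted points lie closer than `r_min`.

```python
import math
import random


def poisson_disk_dart_throwing(n_samples, dim, r_min, max_tries=10000, seed=0):
    """Naive dart throwing in [0, 1)^dim (illustrative, not the paper's method).

    A uniformly drawn candidate is accepted only if it lies at least
    r_min (Euclidean distance) from every previously accepted point.
    May return fewer than n_samples points if max_tries is exhausted.
    """
    rng = random.Random(seed)
    points = []
    tries = 0
    while len(points) < n_samples and tries < max_tries:
        tries += 1
        candidate = [rng.random() for _ in range(dim)]
        if all(math.dist(candidate, p) >= r_min for p in points):
            points.append(candidate)
    return points


def min_pairwise_distance(points):
    """Smallest Euclidean distance between any two points (needs >= 2 points)."""
    return min(
        math.dist(p, q)
        for i, p in enumerate(points)
        for q in points[i + 1:]
    )


if __name__ == "__main__":
    # Hypothetical use: an initial exploratory design over 3 hyper-parameters,
    # each rescaled to [0, 1).
    design = poisson_disk_dart_throwing(n_samples=20, dim=3, r_min=0.2)
    print(len(design), min_pairwise_distance(design))
```

Dart throwing becomes slow as `r_min` approaches the largest feasible spacing or as the dimension grows, which is one reason dedicated synthesis algorithms are needed in practice. The resulting points could seed a sequential optimizer in place of a uniform random or low-discrepancy initial sample.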

Code Implementations: 1 repo