ME LGFeb 3, 2024

Sample-Efficient "Clustering and Conquer" Procedures for Parallel Large-Scale Ranking and Selection

arXiv:2402.02196v52.34 citationsh-index: 2

Originality Incremental advance

AI Analysis

This addresses sample efficiency for large-scale AI applications like neural architecture search, representing an incremental improvement to existing parallel computing frameworks.

This work tackles the problem of sample inefficiency in parallel large-scale ranking and selection by introducing a correlation-based clustering step, achieving an O(p) reduction in sample complexity for a class of sample-optimal procedures.

This work aims to improve the sample efficiency of parallel large-scale ranking and selection (R&S) problems by leveraging correlation information. We modify the commonly used "divide and conquer" framework in parallel computing by adding a correlation-based clustering step, transforming it into "clustering and conquer". Analytical results under a symmetric benchmark scenario show that this seemingly simple modification yields an $\mathcal{O}(p)$ reduction in sample complexity for a widely used class of sample-optimal R&S procedures. Our approach enjoys two key advantages: 1) it does not require highly accurate correlation estimation or precise clustering, and 2) it allows for seamless integration with various existing R&S procedures, while achieving optimal sample complexity. Theoretically, we develop a novel gradient analysis framework to analyze sample efficiency and guide the design of large-scale R&S procedures. We also introduce a new parallel clustering algorithm tailored for large-scale scenarios. Finally, in large-scale AI applications such as neural architecture search, our methods demonstrate superior performance.

View on arXiv PDF

Similar