The Sample Complexity of Parameter-Free Stochastic Convex Optimization
This work addresses the problem of parameter-free optimization for machine learning practitioners, offering incremental improvements in adaptability and sample efficiency.
The paper tackles the sample complexity of stochastic convex optimization when problem parameters are unknown, developing a reliable model selection method that matches optimal known-parameter sample complexity up to log-log factors and a regularization-based method that perfectly adapts to unknown distance to optimality, demonstrating a separation between sample and computational complexity.
We study the sample complexity of stochastic convex optimization when problem parameters, e.g., the distance to optimality, are unknown. We pursue two strategies. First, we develop a reliable model selection method that avoids overfitting the validation set. This method allows us to generically tune the learning rate of stochastic optimization methods to match the optimal known-parameter sample complexity up to $\log\log$ factors. Second, we develop a regularization-based method that is specialized to the case that only the distance to optimality is unknown. This method provides perfect adaptability to unknown distance to optimality, demonstrating a separation between the sample and computational complexity of parameter-free stochastic convex optimization. Combining these two methods allows us to simultaneously adapt to multiple problem structures. Experiments performing few-shot learning on CIFAR-10 by fine-tuning CLIP models and prompt engineering Gemini to count shapes indicate that our reliable model selection method can help mitigate overfitting to small validation sets.