Optimal Stochastic Nonconvex Optimization with Bandit Feedback
This addresses a theoretical challenge in stochastic optimization with bandit feedback for nonconvex settings, representing an incremental improvement over existing methods.
The paper tackles the problem of continuous armed bandit optimization for nonconvex cost functions by proposing an adaptive bin splitting method, which achieves locally minimax optimal expected cumulative regret as proven by derived bounds.
In this paper, we analyze the continuous armed bandit problems for nonconvex cost functions under certain smoothness and sublevel set assumptions. We first derive an upper bound on the expected cumulative regret of a simple bin splitting method. We then propose an adaptive bin splitting method, which can significantly improve the performance. Furthermore, a minimax lower bound is derived, which shows that our new adaptive method achieves locally minimax optimal expected cumulative regret.