Convergence Analysis of Fractional Gradient Descent
It addresses the limited theoretical understanding of fractional gradient descent for optimization, which is incremental in extending analysis to new settings.
This paper analyzes the convergence properties of fractional gradient descent in smooth convex, strongly convex, and non-convex settings, proving linear convergence for strongly convex functions and O(1/T) rates for convex and non-convex cases, with empirical results showing potential speed-ups over standard gradient descent.
Fractional derivatives are a well-studied generalization of integer order derivatives. Naturally, for optimization, it is of interest to understand the convergence properties of gradient descent using fractional derivatives. Convergence analysis of fractional gradient descent is currently limited both in the methods analyzed and the settings analyzed. This paper aims to fill in these gaps by analyzing variations of fractional gradient descent in smooth and convex, smooth and strongly convex, and smooth and non-convex settings. First, novel bounds will be established bridging fractional and integer derivatives. Then, these bounds will be applied to the aforementioned settings to prove linear convergence for smooth and strongly convex functions and $O(1/T)$ convergence for smooth and convex functions. Additionally, we prove $O(1/T)$ convergence for smooth and non-convex functions using an extended notion of smoothness - Hölder smoothness - that is more natural for fractional derivatives. Finally, empirical results will be presented on the potential speed up of fractional gradient descent over standard gradient descent as well as some preliminary theoretical results explaining this speed up.