DTN: A Learning Rate Scheme with Convergence Rate of $\mathcal{O}(1/t)$ for SGD
This is a correction notice for researchers in optimization: it retracts flawed convergence guarantees for SGD.
The authors attempted to develop a learning rate scheme for SGD with a convergence rate of $\mathcal{O}(1/t)$, but the claims are invalid because of a mathematical error in Lemma 5, which carries over to Theorem 1, Corollary 1, Theorem 2, and Corollary 2.
This paper contains inconsistent results: some of our claims fail because we made mistakes in applying the convergence test for a series. Specifically, the $\mathcal{O}(1/t)$ convergence rate of SGD claimed in Theorem 1, Corollary 1, Theorem 2, and Corollary 2 is incorrectly derived, because these results all rely on Lemma 5. In Lemma 5, we did not apply the convergence test for a series correctly, so the result of Lemma 5 is not valid. We would like to thank the community for pointing out this mistake!