Lin Zhao

h-index: 60

Papers: 3

Citations: 1,158

Novelty: 55%

AI Score: 31

Ranked #128,960 of 194,257 authors (top 66%); #28,383 in LG (top 71%)

3 Papers

10.6 | LG | Sep 29, 2020

Finite-Time Analysis for Double Q-learning

Huaqing Xiong, Lin Zhao, Yingbin Liang et al.

Although Q-learning is one of the most successful algorithms for finding the best action-value function (and thus the optimal policy) in reinforcement learning, its implementation often suffers from large overestimation of Q-function values incurred by random sampling. The double Q-learning algorithm proposed by Hasselt (2010) overcomes such an overestimation issue by randomly switching the update between two Q-estimators, and has thus gained significant popularity in practice. However, the theoretical understanding of double Q-learning is rather limited. So far only the asymptotic convergence has been established, which does not characterize how fast the algorithm converges. In this paper, we provide the first non-asymptotic (i.e., finite-time) analysis for double Q-learning. We show that both synchronous and asynchronous double Q-learning are guaranteed to converge to an $\epsilon$-accurate neighborhood of the global optimum by taking $\tilde{\Omega}\left(\left(\frac{1}{(1-\gamma)^6\epsilon^2}\right)^{\frac{1}{\omega}} + \left(\frac{1}{1-\gamma}\right)^{\frac{1}{1-\omega}}\right)$ iterations, where $\omega\in(0,1)$ is the decay parameter of the learning rate, and $\gamma$ is the discount factor. Our analysis develops novel techniques to derive finite-time bounds on the difference between two inter-connected stochastic processes, which is new to the literature of stochastic approximation.
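As an illustration of the update the abstract describes (randomly switching between two Q-estimators), here is a minimal tabular sketch. It is not the authors' code. The step size $\alpha_t = 1/t^{\omega}$ is an assumed form of the "decay parameter" schedule, and `toy_step`, the epsilon-greedy behavior policy, and all default values are hypothetical choices made only for this example.

```python
import random


def toy_step(state, action, rng):
    """Hypothetical one-state environment: action 1 pays 1, action 0 pays 0."""
    reward = 1.0 if action == 1 else 0.0
    return reward, 0  # always return to the single state


def double_q_learning(n_states, n_actions, step, gamma=0.5, omega=0.8,
                      n_iters=5000, explore=0.1, seed=0):
    """Asynchronous tabular double Q-learning (illustrative sketch)."""
    rng = random.Random(seed)
    q_a = [[0.0] * n_actions for _ in range(n_states)]
    q_b = [[0.0] * n_actions for _ in range(n_states)]
    state = 0
    for t in range(1, n_iters + 1):
        alpha = 1.0 / (t ** omega)  # assumed polynomially decaying step size

        # Behavior policy: epsilon-greedy on the sum of both estimators.
        if rng.random() < explore:
            action = rng.randrange(n_actions)
        else:
            totals = [q_a[state][a] + q_b[state][a] for a in range(n_actions)]
            action = max(range(n_actions), key=totals.__getitem__)

        reward, next_state = step(state, action, rng)

        # Randomly choose which estimator is updated on this iteration.
        if rng.random() < 0.5:
            learner, evaluator = q_a, q_b
        else:
            learner, evaluator = q_b, q_a

        # The learner selects the greedy next action; the other estimator
        # evaluates it. Decoupling selection from evaluation is what
        # counters the overestimation of plain Q-learning.
        best = max(range(n_actions), key=learner[next_state].__getitem__)
        target = reward + gamma * evaluator[next_state][best]
        learner[state][action] += alpha * (target - learner[state][action])

        state = next_state
    return q_a, q_b


q_a, q_b = double_q_learning(n_states=1, n_actions=2, step=toy_step)
```

With rewards in $[0, 1]$ and zero initialization, every entry of both tables stays within $[0, 1/(1-\gamma)]$, because each update is a convex combination of the old value and a target in that range.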

8.5 | LG | Jul 30, 2020

Momentum Q-learning with Finite-Sample Convergence Guarantee

Bowen Weng, Huaqing Xiong, Lin Zhao et al.

Existing studies indicate that momentum ideas in conventional optimization can be used to improve the performance of Q-learning algorithms. However, the finite-sample analysis for momentum-based Q-learning algorithms is only available for the tabular case without function approximations. This paper analyzes a class of momentum-based Q-learning algorithms with finite-sample guarantees. Specifically, we propose the MomentumQ algorithm, which integrates Nesterov's and Polyak's momentum schemes and generalizes the existing momentum-based Q-learning algorithms. For the infinite state-action space case, we establish the convergence guarantee for MomentumQ with linear function approximations and Markovian sampling. In particular, we characterize the finite-sample convergence rate, which is provably faster than that of vanilla Q-learning. This is the first finite-sample analysis for momentum-based Q-learning algorithms with function approximations. For the tabular case under synchronous sampling, we also obtain a finite-sample convergence rate that is slightly better than that of SpeedyQ (Azar et al., 2011) when choosing a special family of step sizes. Finally, we demonstrate through various experiments that the proposed MomentumQ outperforms other momentum-based Q-learning algorithms.

32.3 | CL | Jun 14, 2018 | Code

Structure-Infused Copy Mechanisms for Abstractive Summarization

Kaiqiang Song, Lin Zhao, Fei Liu

Seq2seq learning has produced promising results on summarization. However, in many cases, system summaries still struggle to keep the meaning of the original intact. They may miss important words or relations that play critical roles in the syntactic structure of source sentences. In this paper, we present structure-infused copy mechanisms to facilitate copying important words and relations from the source sentence to the summary sentence. The approach naturally combines source dependency structure with the copy mechanism of an abstractive sentence summarizer. Experimental results demonstrate the effectiveness of incorporating source-side syntactic information in the system, and our proposed approach compares favorably to state-of-the-art methods.