NA · Mar 18
Modified Halley's method for computation of zeros of solution of second order ODEs
Dhivya Prabhu K, Sanjeev Singh, Antony Vijesh
This paper develops an efficient iterative method for computing all zeros of solutions of second-order ordinary differential equations. A third-order Halley's method is first derived by approximating the solution of an associated Riccati differential equation. To improve computational efficiency, a modified Halley's method is proposed by fixing one of the functions in Halley's scheme as a constant. The modified Halley's method retains third-order convergence. Based on the behavior of the coefficients of the second-order ODE, nonlocal convergence results are established for both Halley's and the modified Halley's methods. Suitable initial guesses for computing all zeros of solutions of second-order ODEs in a given interval are also presented for both methods. Furthermore, algorithms based on the modified Halley's method are developed to compute all nodes and weights for Gauss-Legendre and Gauss-Hermite quadratures. A comparative numerical study with recent methods demonstrates the efficiency of the proposed algorithms.
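To make the idea concrete, the following is a minimal sketch of the classical third-order Halley iteration applied to Legendre polynomials, using the Legendre ODE (1 - x^2) y'' - 2x y' + n(n+1) y = 0 to obtain y'' from y and y'. It is not the modified method of the paper: the function names, the cosine-based initial guesses, and the tolerance are assumptions chosen for illustration.

```python
import math

def legendre_and_derivative(n, x):
    """Return (P_n(x), P_n'(x)) using the three-term recurrence (requires |x| < 1)."""
    if n == 0:
        return 1.0, 0.0
    p_prev, p = 1.0, x  # P_0 and P_1
    for k in range(2, n + 1):
        p_prev, p = p, ((2 * k - 1) * x * p - (k - 1) * p_prev) / k
    dp = n * (x * p - p_prev) / (x * x - 1.0)
    return p, dp

def gauss_legendre(n, tol=1e-14, max_iter=50):
    """Nodes and weights of n-point Gauss-Legendre quadrature via classical Halley steps."""
    nodes, weights = [], []
    for i in range(1, n + 1):
        # Assumed starting point: a standard cosine approximation to the i-th zero.
        x = math.cos(math.pi * (i - 0.25) / (n + 0.5))
        for _ in range(max_iter):
            y, dy = legendre_and_derivative(n, x)
            # Second derivative taken from the Legendre ODE, not by differencing.
            d2y = (2.0 * x * dy - n * (n + 1) * y) / (1.0 - x * x)
            step = 2.0 * y * dy / (2.0 * dy * dy - y * d2y)  # Halley correction
            x -= step
            if abs(step) < tol:
                break
        _, dy = legendre_and_derivative(n, x)
        nodes.append(x)
        weights.append(2.0 / ((1.0 - x * x) * dy * dy))
    return nodes, weights

if __name__ == "__main__":
    xs, ws = gauss_legendre(5)
    for x, w in zip(xs, ws):
        print(f"{x:+.15f}  {w:.15f}")
```

Because the ODE supplies the second derivative, each Halley step costs only one evaluation of the solution and its first derivative, which is the general reason ODE-based zero finders of this kind are attractive.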
LG · Jul 2, 2024
Two-Step Q-Learning
Antony Vijesh, Shreyas S R
Q-learning is a stochastic approximation version of classic value iteration. The literature has established that Q-learning suffers from both maximization bias and slow convergence. Recently, multi-step algorithms have shown practical advantages over existing methods. This paper proposes novel off-policy two-step Q-learning algorithms that do not require importance sampling. Under suitable assumptions, the iterates of the proposed two-step Q-learning are shown to be bounded and to converge almost surely to the optimal Q-values. This study also addresses the convergence analysis of a smooth version of two-step Q-learning, obtained by replacing the max function with the log-sum-exp function. The proposed algorithms are robust and easy to implement. Finally, we test the proposed algorithms on benchmark problems such as the roulette problem, the maximization bias problem, and randomly generated Markov decision processes, and compare them with existing methods in the literature. Numerical experiments demonstrate the superior performance of both two-step Q-learning and its smooth variants.
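As a rough illustration of the max versus log-sum-exp substitution mentioned above, the sketch below shows a standard one-step tabular Q-learning update with an optional smooth maximum. It is not the two-step algorithm of the paper; the names `smooth_max` and `q_update` and the parameter `beta` are hypothetical choices made for this example.

```python
import math

def smooth_max(values, beta):
    """Log-sum-exp approximation of max(values): (1/beta) * log(sum(exp(beta * v))).

    The maximum is subtracted before exponentiating for numerical stability.
    The result lies between max(values) and max(values) + log(len(values)) / beta.
    """
    m = max(values)
    return m + math.log(sum(math.exp(beta * (v - m)) for v in values)) / beta

def q_update(q, s, a, r, s_next, alpha, gamma, beta=None):
    """One-step tabular Q-learning update on q[state][action], done in place.

    With beta=None the hard max is used; otherwise the log-sum-exp smooth max.
    """
    if beta is None:
        next_value = max(q[s_next])
    else:
        next_value = smooth_max(q[s_next], beta)
    q[s][a] += alpha * (r + gamma * next_value - q[s][a])
    return q[s][a]

if __name__ == "__main__":
    q = [[0.0, 0.0], [1.0, 2.0]]
    print(q_update(q, s=0, a=1, r=1.0, s_next=1, alpha=0.5, gamma=0.9))
    print(smooth_max([1.0, 2.0, 3.0], beta=10.0))
```

Larger values of `beta` bring the smooth maximum closer to the hard maximum, at the cost of a less smooth operator.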
LG · Jul 5, 2024
A Multi-Step Minimax Q-learning Algorithm for Two-Player Zero-Sum Markov Games
Shreyas S R, Antony Vijesh
An iterative procedure is proposed for solving two-player zero-sum Markov games. Under suitable assumptions, the boundedness of the proposed iterates is established theoretically. Using results from stochastic approximation, the almost sure convergence of the proposed two-step minimax Q-learning is then proved. More specifically, the proposed algorithm converges to the game-theoretic optimal value with probability one when the model information is not known. Numerical simulations confirm that the proposed algorithm is effective and easy to implement.
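For orientation, here is a minimal sketch of a standard one-step minimax Q-learning update, restricted to games where each player has exactly two actions so that the stage-game value has a closed form. It is not the multi-step algorithm of the paper; the function names and the 2x2 restriction are assumptions made to keep the example self-contained.

```python
def matrix_game_value_2x2(m):
    """Value of the zero-sum game m = [[a, b], [c, d]]; the row player maximizes."""
    (a, b), (c, d) = m
    maximin = max(min(a, b), min(c, d))  # best pure guarantee for the row player
    minimax = min(max(a, c), max(b, d))  # best pure guarantee for the column player
    if maximin == minimax:
        return maximin  # a pure-strategy saddle point exists
    # No saddle point: both players mix, and the value has this closed form.
    return (a * d - b * c) / (a + d - b - c)

def minimax_q_update(q, s, a, b, r, s_next, alpha, gamma):
    """One-step minimax Q-learning update on q[state][row_action][col_action]."""
    target = r + gamma * matrix_game_value_2x2(q[s_next])
    q[s][a][b] += alpha * (target - q[s][a][b])
    return q[s][a][b]

if __name__ == "__main__":
    # State 1 holds a matching-pennies payoff table, whose game value is 0.
    q = {0: [[0.0, 0.0], [0.0, 0.0]], 1: [[1.0, -1.0], [-1.0, 1.0]]}
    print(minimax_q_update(q, s=0, a=0, b=1, r=1.0, s_next=1, alpha=0.5, gamma=0.9))
```

For larger action sets the stage-game value is usually obtained by solving a small linear program instead of using the closed form.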