A Tighter Convergence Proof of Reverse Experience Replay
This work addresses a theoretical gap for researchers in reinforcement learning, but it is incremental as it builds on existing RER methods.
The paper tackles the theoretical limitations of Reverse Experience Replay (RER) in reinforcement learning, which previously required minimal learning rates and short steps, by providing a tighter analysis that allows for larger learning rates and longer sequences, showing improved convergence.
In reinforcement learning, Reverse Experience Replay (RER) is a recently proposed algorithm that attains better sample complexity than the classic experience replay method. RER requires the learning algorithm to update the parameters through consecutive state-action-reward tuples in reverse order. However, the most recent theoretical analysis only holds for a minimal learning rate and short consecutive steps, which converge slower than those large learning rate algorithms without RER. In view of this theoretical and empirical gap, we provide a tighter analysis that mitigates the limitation on the learning rate and the length of consecutive steps. Furthermore, we show theoretically that RER converges with a larger learning rate and a longer sequence.