DS LG ST
Jun 8, 2023

A Cover Time Study of a non-Markovian Algorithm

arXiv:2306.04902v2
h-index: 40
Originality: Incremental advance
AI Analysis

This work provides theoretical insights into non-Markovian exploration methods, with implications for reinforcement learning algorithms like UCB and MCTS. The advance is incremental: it extends cover time analysis to a new class of methods.

The paper tackles the problem of analyzing cover time for non-Markovian traversal algorithms, showing that a negative feedback strategy outperforms naive random walks by improving search efficiency locally for arbitrary graphs and achieving smaller cover times for specific graph types like cliques and trees.

Given a traversal algorithm, cover time is the expected number of steps needed to visit all nodes in a given graph. A smaller cover time means a higher exploration efficiency of the traversal algorithm. Although random walk algorithms have been studied extensively in the existing literature, there has been no cover time result for any non-Markovian method. In this work, we take a theoretical perspective and show that the negative feedback strategy (a count-based exploration method) is better than the naive random walk search. In particular, the former strategy can locally improve the search efficiency for an arbitrary graph. It also achieves smaller cover times for special but important graphs, including clique graphs and tree graphs. Moreover, we make connections between our results and the reinforcement learning literature to give new insights into why the classical UCB and MCTS algorithms are so useful. Various numerical results corroborate our theoretical findings.
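The comparison in the abstract can be illustrated with a small simulation on a clique. This is a minimal sketch, not the paper's implementation: it assumes the negative feedback rule means "move to the least-visited neighbor, breaking ties uniformly at random" using global visit counts, and the function names (`complete_graph`, `cover_steps`, `mean_cover_time`) are hypothetical.

```python
import random


def complete_graph(n):
    """Adjacency lists of the clique K_n on nodes 0..n-1."""
    return {u: [v for v in range(n) if v != u] for u in range(n)}


def cover_steps(adj, start, choose, rng):
    """Steps one walk needs to visit every node (graph assumed connected)."""
    counts = {u: 0 for u in adj}
    counts[start] = 1
    unvisited = len(adj) - 1
    current, steps = start, 0
    while unvisited > 0:
        current = choose(adj[current], counts, rng)
        if counts[current] == 0:
            unvisited -= 1
        counts[current] += 1
        steps += 1
    return steps


def random_walk(neighbors, counts, rng):
    """Markovian baseline: pick a neighbor uniformly, ignoring history."""
    return rng.choice(neighbors)


def negative_feedback(neighbors, counts, rng):
    """Assumed rule: pick a least-visited neighbor, ties broken at random."""
    fewest = min(counts[v] for v in neighbors)
    return rng.choice([v for v in neighbors if counts[v] == fewest])


def mean_cover_time(adj, choose, trials=2000, seed=0):
    """Monte Carlo estimate of the cover time starting from node 0."""
    rng = random.Random(seed)
    total = sum(cover_steps(adj, 0, choose, rng) for _ in range(trials))
    return total / trials


if __name__ == "__main__":
    clique = complete_graph(10)
    print("random walk:      ", mean_cover_time(clique, random_walk))
    print("negative feedback:", mean_cover_time(clique, negative_feedback))
```

On a clique with n nodes, this version of negative feedback always steps to an unvisited node, so it covers the graph in exactly n - 1 steps. The simple random walk behaves like a coupon collector, with expected cover time (n - 1)(1 + 1/2 + ... + 1/(n - 1)), roughly 25.5 steps for n = 10.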

