Learning Markov models via low-rank optimization
This addresses the problem of efficient system modeling from limited data for optimization and decision-making, with incremental improvements in estimation methods.
The paper tackles learning a Markov model from a single trajectory when the transition model has low rank, showing that accurate estimation is possible with a trajectory length proportional to the total number of states. It proposes convex and nonconvex maximum likelihood estimators, deriving statistical rates and establishing a minimax lower bound, with empirical results showing the nonconvex estimator's superiority.
Modeling unknown systems from data is a precursor of system optimization and sequential decision making. In this paper, we focus on learning a Markov model from a single trajectory of states. Suppose that the transition model has a small rank despite of having a large state space, meaning that the system admits a low-dimensional latent structure. We show that one can estimate the full transition model accurately using a trajectory of length that is proportional to the total number of states. We propose two maximum likelihood estimation methods: a convex approach with nuclear-norm regularization and a nonconvex approach with rank constraint. We explicitly derive the statistical rates of both estimators in terms of the Kullback-Leiber divergence and the $\ell_2$ error and also establish a minimax lower bound to assess the tightness of these rates. For computing the nonconvex estimator, we develop a novel DC (difference of convex function) programming algorithm that starts with the convex M-estimator and then successively refines the solution till convergence. Empirical experiments demonstrate consistent superiority of the nonconvex estimator over the convex one.