ML, LG | May 3, 2021

Learning Good State and Action Representations via Tensor Decomposition

arXiv:2105.01136v2 | 8 citations
Originality: Incremental advance
AI Analysis

This work addresses representation learning for MDPs, which is important for improving efficiency in reinforcement learning. It appears incremental because it builds on existing tensor methods.

The paper tackles the problem of learning meaningful low-dimensional state and action representations in continuous-state-action Markov decision processes (MDPs). It proposes an unsupervised tensor decomposition method, proves sharp statistical error bounds for it, and shows that the learned abstractions accurately approximate latent block structures when they exist, which supports downstream tasks such as policy evaluation.

The transition kernel of a continuous-state-action Markov decision process (MDP) admits a natural tensor structure. This paper proposes a tensor-inspired unsupervised learning method to identify meaningful low-dimensional state and action representations from empirical trajectories. The method exploits the MDP's tensor structure by kernelization, importance sampling and low-Tucker-rank approximation. This method can be further used to cluster states and actions respectively and find the best discrete MDP abstraction. We provide sharp statistical error bounds for tensor concentration and the preservation of diffusion distance after embedding. We further prove that the learned state/action abstractions provide accurate approximations to latent block structures if they exist, enabling function approximation in downstream tasks such as policy evaluation.
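To make the pipeline in the abstract concrete, here is a minimal sketch of one way to form an empirical transition tensor from sampled transitions and compress it to low Tucker rank. It is an illustration under stated assumptions, not the paper's algorithm: the feature maps are simple one-hot bin indicators standing in for kernel features, the state-action pairs are sampled uniformly so no importance weighting is applied, the low-rank step uses a plain truncated HOSVD, and the toy block MDP and all function names (`empirical_tensor`, `truncated_hosvd`, `one_hot_bins`) are hypothetical.

```python
import numpy as np

def unfold(T, mode):
    """Matricize a 3-way tensor so that rows index the chosen mode."""
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

def empirical_tensor(phi_s, psi_a, phi_next):
    """Average the outer product of features over N transitions (s, a, s')."""
    n = phi_s.shape[0]
    return np.einsum("ni,nj,nk->ijk", phi_s, psi_a, phi_next) / n

def truncated_hosvd(T, ranks):
    """Low-Tucker-rank approximation via truncated HOSVD (an assumed stand-in)."""
    factors = []
    for mode, r in enumerate(ranks):
        U, _, _ = np.linalg.svd(unfold(T, mode), full_matrices=False)
        factors.append(U[:, :r])  # top-r left singular vectors of the unfolding
    core = np.einsum("ijk,ia,jb,kc->abc", T, *factors)
    return core, factors

def reconstruct(core, factors):
    """Map the core tensor back to the original feature space."""
    return np.einsum("abc,ia,jb,kc->ijk", core, *factors)

def one_hot_bins(x, bins):
    """Hypothetical feature map: indicator of which equal-width bin x falls in."""
    idx = np.minimum((x * bins).astype(int), bins - 1)
    return np.eye(bins)[idx]

# Toy block MDP (assumption): states and actions in [0, 1), each with two
# latent blocks split at 0.5; the next state's block depends only on the blocks.
rng = np.random.default_rng(0)
n, bins = 20000, 6
s, a = rng.random(n), rng.random(n)
z_s, z_a = (s >= 0.5).astype(int), (a >= 0.5).astype(int)
p_upper = np.array([[0.1, 0.9], [0.8, 0.2]])  # P(next state in upper half)
upper = rng.random(n) < p_upper[z_s, z_a]
s_next = 0.5 * rng.random(n) + 0.5 * upper

T_hat = empirical_tensor(one_hot_bins(s, bins),
                         one_hot_bins(a, bins),
                         one_hot_bins(s_next, bins))
core, factors = truncated_hosvd(T_hat, ranks=(2, 2, 2))
rel_err = np.linalg.norm(T_hat - reconstruct(core, factors)) / np.linalg.norm(T_hat)

# Rows of the mode-1 factor serve as low-dimensional state representations;
# rows of the mode-2 factor play the same role for actions.
state_embedding, action_embedding = factors[0], factors[1]
print(f"relative reconstruction error: {rel_err:.3f}")
```

In this sketch, bins that belong to the same latent block should receive nearby rows in `state_embedding`, so running a clustering method such as k-means on those rows is one way to recover a discrete abstraction. With non-uniform sampling, each transition would need to be reweighted by the inverse sampling density before averaging.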

