LG · AI · Jul 2, 2019

Learning the Arrow of Time

arXiv:1907.01285v1 · 6 citations
Originality: Incremental advance
AI Analysis

The paper addresses the challenge of modeling temporal asymmetry, with applications such as reachability estimation and intrinsic rewards in AI systems. The advance appears incremental because it builds on existing theoretical frameworks.

The paper tackles the problem of learning an arrow of time in Markov processes so that it captures information about the environment. It reports empirical results on discrete and continuous environments and shows, for a class of stochastic processes, that the learned arrow of time agrees reasonably well with the one given by the Jordan-Kinderlehrer-Otto result.

We humans seem to have an innate understanding of the asymmetric progression of time, which we use to efficiently and safely perceive and manipulate our environment. Drawing inspiration from that, we address the problem of learning an arrow of time in a Markov (Decision) Process. We illustrate how a learned arrow of time can capture meaningful information about the environment, which in turn can be used to measure reachability, detect side-effects and to obtain an intrinsic reward signal. We show empirical results on a selection of discrete and continuous environments, and demonstrate for a class of stochastic processes that the learned arrow of time agrees reasonably well with a known notion of an arrow of time given by the celebrated Jordan-Kinderlehrer-Otto result.
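
To make the idea of a learned arrow of time concrete, here is a minimal tabular sketch. It is not the paper's implementation. It assumes an illustrative objective: fit a scalar potential over states whose value increases, on average, along observed transitions, with a quadratic penalty that keeps the increments bounded. The toy chain, the function names (`sample_transitions`, `fit_potential`, `potential_increase`) and all parameter values are hypothetical.

```python
import random


def sample_transitions(n_states=5, n_steps=2000, p_forward=0.9, seed=0):
    """Roll out a toy irreversible chain (an assumption for illustration).

    From state i the chain moves to i + 1 with probability p_forward and
    otherwise stays put. It never moves backward. On reaching the last
    state the episode restarts from 0; the restart is not recorded.
    """
    rng = random.Random(seed)
    transitions, s = [], 0
    for _ in range(n_steps):
        if s == n_states - 1:
            s = 0
            continue
        s_next = s + 1 if rng.random() < p_forward else s
        transitions.append((s, s_next))
        s = s_next
    return transitions


def fit_potential(transitions, n_states=5, lam=0.5, lr=1.0, epochs=500):
    """Gradient ascent on mean[(h[s'] - h[s]) - lam * (h[s'] - h[s])**2].

    The first term rewards a potential that grows along transitions; the
    assumed quadratic penalty stops it from growing without bound.
    """
    h = [0.0] * n_states
    n = len(transitions)
    for _ in range(epochs):
        grad = [0.0] * n_states
        for s, s_next in transitions:
            d = h[s_next] - h[s]
            g = (1.0 - 2.0 * lam * d) / n  # derivative of d - lam * d**2
            grad[s_next] += g
            grad[s] -= g  # self-transitions cancel out exactly
        h = [h_i + lr * g_i for h_i, g_i in zip(h, grad)]
    return h


def potential_increase(h, s, s_next):
    """Change in potential along a transition: positive values point
    'forward in time', and could serve as a reachability or
    intrinsic-reward style signal."""
    return h[s_next] - h[s]


if __name__ == "__main__":
    h = fit_potential(sample_transitions())
    print([round(v, 3) for v in h])
    print(potential_increase(h, 0, 1), potential_increase(h, 1, 0))
```

On this chain every observed move is irreversible, so the fitted potential rises by about 1 / (2 * lam) per forward step, and the hypothetical reverse move from state 1 to state 0 gets a negative score. In an environment with reversible transitions, the forward and backward increments on the same pair of states would pull against each other and the potential would stay comparatively flat there.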

