AI · Mar 19, 2017

Multi-Timescale, Gradient Descent, Temporal Difference Learning with Linear Options

arXiv:1703.06471v1
Originality: Incremental advance
AI Analysis

This work addresses real-time deliberation for reinforcement learning agents. Its contribution appears incremental: it combines existing temporal abstraction methods with multi-timescale options.

The paper tackles the challenge of efficient planning in reinforcement learning with large or continuous state spaces by proposing a convergent algorithm that uses temporal abstraction and randomly generated option models over multiple timescales, resulting in a reduction in the number of decision epochs needed to solve tasks.

Deliberating over large or continuous state spaces has been a long-standing challenge in reinforcement learning. Temporal abstraction has made this somewhat more tractable, but planning efficiently with temporal abstraction remains an issue. Moreover, using spatial abstraction to learn policies for various situations at once while using temporally abstract models is an open problem. We propose an efficient algorithm that is convergent under linear function approximation while planning with temporally abstract actions. We show how this algorithm can be used with randomly generated option models over multiple timescales to plan for agents that must act in real time. Using these randomly generated option models over multiple timescales is shown to reduce the number of decision epochs required to solve the given task, effectively reducing the time needed for deliberation.
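
To make the idea of planning with a linear option model under a gradient-descent TD rule more concrete, here is a minimal sketch. It is not taken from the paper: it assumes a TDC-style (gradient-corrected TD) update and a linear option model given by a matrix `F` (expected discounted feature vector at option termination) and a vector `b` (expected discounted reward accumulated while the option runs). The names `tdc_option_update`, `F`, `b`, `theta`, and `w` are all illustrative assumptions.

```python
# Illustrative sketch only: a TDC-style planning update with a linear option model.
# F, b, theta, w and the function names are assumptions, not the paper's notation.

def dot(a, b):
    """Inner product of two equal-length lists."""
    return sum(x * y for x, y in zip(a, b))


def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [dot(row, v) for row in M]


def tdc_option_update(theta, w, phi, F, b, alpha=0.1, beta=0.1):
    """One gradient-corrected TD update using an option model (F, b).

    theta: value-function weights
    w:     auxiliary weights used for the gradient correction
    phi:   feature vector of the state where the option starts
    """
    # Expected discounted features when the option terminates (discount is folded into F).
    phi_next = matvec(F, phi)
    # TD error of the option-level backup.
    delta = dot(b, phi) + dot(theta, phi_next) - dot(theta, phi)
    correction = dot(w, phi)
    new_theta = [t + alpha * (delta * p - correction * pn)
                 for t, p, pn in zip(theta, phi, phi_next)]
    new_w = [wi + beta * (delta - correction) * p for wi, p in zip(w, phi)]
    return new_theta, new_w, delta


def plan(F, b, states, sweeps=2000):
    """Sweep the update over a fixed list of feature vectors."""
    theta = [0.0] * len(b)
    w = [0.0] * len(b)
    for _ in range(sweeps):
        for phi in states:
            theta, w, _ = tdc_option_update(theta, w, phi, F, b)
    return theta


# Toy problem with one-hot features: from state 0 the option yields reward 1 and
# ends in state 1 with discount 0.5; from state 1 it yields reward 2 and terminates.
F_toy = [[0.0, 0.0], [0.5, 0.0]]
b_toy = [1.0, 2.0]
states_toy = [[1.0, 0.0], [0.0, 1.0]]
theta_toy = plan(F_toy, b_toy, states_toy)
```

In this toy problem the fixed point is a value of 2 for both states (state 1 earns 2, and state 0 earns 1 + 0.5 × 2). Options that run over longer timescales would appear here as different `(F, b)` pairs with smaller effective discounts, so each backup covers more environment steps per decision epoch.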
