LG, May 10, 2017

GQ($λ$) Quick Reference and Implementation Guide

arXiv:1705.03967v1

This document provides a quick reference and implementation guide for linear GQ(λ), a gradient-based off-policy temporal-difference learning algorithm, without presenting new research results or concrete numbers.

This document serves as a quick reference and implementation guide for linear GQ($λ$), a gradient-based off-policy temporal-difference learning algorithm. The intuition and theory behind the algorithm are explained elsewhere (e.g., Maei & Sutton 2010, Maei 2011). If you have questions or concerns about the content of this document or the attached Java code, please email Adam White (adam.white@ualberta.ca). The code is provided as part of the source files in the arXiv submission.
