LG, May 10, 2017

GQ($λ$) Quick Reference and Implementation Guide

arXiv:1705.03967v1


This document should serve as a quick reference for, and guide to, the implementation of linear GQ($λ$), a gradient-based off-policy temporal-difference learning algorithm. Explanations of the intuition and theory behind the algorithm are provided elsewhere (e.g., Maei & Sutton 2010, Maei 2011). If you have questions or concerns about the content in this document or the attached Java code, please email Adam White (adam.white@ualberta.ca). The code is provided as part of the source files in the arXiv submission.
