Gamma-Nets: Generalizing Value Estimation over Timescale
Γ-nets provide a compact method for multi-timescale prediction in reinforcement learning, applicable to model-based planning and lifelong learning. The contribution is incremental, since it builds on existing value estimation techniques.
The paper tackles the problem of generalizing value function estimation over arbitrary timescales. It introduces Γ-nets, which take the timescale as an input and can therefore make predictions for any timescale. The results show that they are effective, with only a small accuracy cost compared to fixed-timescale estimators.
We present $Γ$-nets, a method for generalizing value function estimation over timescale. By using the timescale as one of the estimator's inputs, we can estimate value for arbitrary timescales. As a result, the prediction target for any timescale is available and we are free to train on multiple timescales at each timestep. Here we empirically evaluate $Γ$-nets in the policy evaluation setting. We first demonstrate the approach on a square wave and then on a robot arm using linear function approximation. Next, we consider the deep reinforcement learning setting using several Atari video games. Our results show that $Γ$-nets can be effective for predicting arbitrary timescales, with only a small cost in accuracy compared to learning estimators for fixed timescales. $Γ$-nets provide a method for compactly making predictions at many timescales without requiring a priori knowledge of the task, making them a valuable contribution to ongoing work on model-based planning, representation learning, and lifelong learning algorithms.
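The idea of passing the timescale to the estimator can be sketched with linear function approximation on a square-wave cumulant. The following is a minimal illustration, not the authors' implementation: the feature construction (a one-hot phase crossed with powers of the discount γ), the step size, the fixed list of training discounts, and all function names are assumptions made for this example.

```python
DEGREE = 3  # highest power of gamma in the features (illustrative choice)


def features(state, gamma, n_states):
    """One-hot state features crossed with [1, gamma, gamma^2, ...]."""
    phi = [0.0] * (n_states * (DEGREE + 1))
    for k in range(DEGREE + 1):
        phi[state * (DEGREE + 1) + k] = gamma ** k
    return phi


def predict(w, phi):
    """Linear value estimate v(s, gamma) = w . phi(s, gamma)."""
    return sum(wi * xi for wi, xi in zip(w, phi))


def td_update(w, s, cumulant, s_next, gammas, n_states, alpha=0.05):
    """Apply one TD(0) update for every discount in `gammas`.

    Because gamma is an input, the bootstrapped target
    cumulant + gamma * v(s_next, gamma) can be formed for any timescale,
    so a single transition trains several timescales at once.
    """
    for gamma in gammas:
        phi = features(s, gamma, n_states)
        phi_next = features(s_next, gamma, n_states)
        target = cumulant + gamma * predict(w, phi_next)
        delta = target - predict(w, phi)
        for i, x in enumerate(phi):
            w[i] += alpha * delta * x
    return w


def run_square_wave(steps=5000, period=8, gammas=(0.0, 0.5, 0.9), alpha=0.05):
    """Policy evaluation on a square wave: the cumulant is 1 for the first
    half of each period and 0 for the second half."""
    w = [0.0] * (period * (DEGREE + 1))
    for t in range(steps):
        s, s_next = t % period, (t + 1) % period
        cumulant = 1.0 if s_next < period // 2 else 0.0
        td_update(w, s, cumulant, s_next, gammas, period, alpha)
    return w
```

After training with `w = run_square_wave()`, a call such as `predict(w, features(0, 0.7, 8))` queries a discount that was not in the training list. How accurate that estimate is depends on how well the chosen features interpolate over γ, which is exactly the trade-off against fixed-timescale estimators that the abstract describes.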