Learning with Value-Ramp
This work proposes a novel learning algorithm for agents, but appears incremental as it builds on existing reinforcement learning concepts without specifying a clear problem or target audience.
The authors develop a learning principle based on forming ramps that guide an agent toward peaks of reward. The resulting Value-Ramp algorithm is described as natural, easy to configure, and robustly implementable with natural numbers.
We study a learning principle based on the intuition of forming ramps. The agent tries to follow an increasing sequence of values until it reaches a peak of reward. The resulting Value-Ramp algorithm is natural, easy to configure, and has a robust implementation with natural numbers.
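
To make the ramp intuition concrete, the following is a minimal sketch of one possible reading of the idea, not the authors' exact algorithm. Everything in it is an assumption made for illustration: the five-state chain environment, the constant PEAK, the greedy policy with random tie-breaking, and the update rule (set a state-action value to PEAK when reward is observed, otherwise to the best successor value minus one, floored at zero). All values stay natural numbers.

```python
import random

PEAK = 10           # hypothetical value assigned when reward is observed
ACTIONS = (-1, +1)  # step left or right on the chain


def step(state, action, n_states):
    """Deterministic chain: clamp at both ends, reward on reaching the last state."""
    nxt = max(0, min(n_states - 1, state + action))
    return nxt, nxt == n_states - 1


def greedy(values, state, rng):
    """Pick an action with the highest value, breaking ties at random."""
    best = max(values[(state, a)] for a in ACTIONS)
    return rng.choice([a for a in ACTIONS if values[(state, a)] == best])


def train(n_states=5, episodes=20, seed=0, max_steps=10_000):
    """Assumed update rule: PEAK on reward, else best successor value minus one."""
    rng = random.Random(seed)
    values = {(s, a): 0 for s in range(n_states) for a in ACTIONS}
    for _ in range(episodes):
        state = 0
        for _ in range(max_steps):
            action = greedy(values, state, rng)
            nxt, rewarded = step(state, action, n_states)
            if rewarded:
                values[(state, action)] = PEAK
                break
            successor_best = max(values[(nxt, a)] for a in ACTIONS)
            values[(state, action)] = max(0, successor_best - 1)
            state = nxt
    return values


values = train()
ramp = [values[(s, +1)] for s in range(4)]
print(ramp)  # [7, 8, 9, 10]
```

In this toy setting the values of the rightward action increase by one per step toward the rewarded state, so a greedy agent that follows increasing values walks up the ramp to the reward. Using an integer decrement rather than a real-valued discount factor is the design choice that keeps every value a natural number.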