Dylan Klein

AI, Jul 14, 2021

Mixing Human Demonstrations with Self-Exploration in Experience Replay for Deep Reinforcement Learning

Dylan Klein, Akansel Cosgun

We investigate the effect of using human demonstration data in the replay buffer for Deep Reinforcement Learning. We use a policy gradient method with a modified experience replay buffer in which a human demonstration experience is sampled with a given probability. We analyze different ratios of demonstration data in a task where an agent attempts to reach a goal while avoiding obstacles. Our results suggest that while the agents trained by pure self-exploration and pure demonstration had similar success rates, the pure demonstration model converged faster to solutions with fewer steps.
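The sampling scheme described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the class name `MixedReplayBuffer`, the parameter `demo_prob`, the buffer capacity, and the fallback to the other pool when one is empty are all assumptions made for illustration. Each sampled experience is drawn from the human demonstration pool with probability `demo_prob`, and from the agent's own self-exploration experiences otherwise.

```python
import random
from collections import deque


class MixedReplayBuffer:
    """Hypothetical replay buffer mixing demonstrations with self-exploration."""

    def __init__(self, demo_transitions, capacity=10000, demo_prob=0.5, seed=None):
        if not 0.0 <= demo_prob <= 1.0:
            raise ValueError("demo_prob must lie in [0, 1]")
        # Demonstration data is fixed and never overwritten (an assumption).
        self.demo = list(demo_transitions)
        # Self-exploration data is stored first-in, first-out.
        self.agent = deque(maxlen=capacity)
        self.demo_prob = demo_prob
        self.rng = random.Random(seed)

    def add(self, transition):
        """Store one transition collected by the agent itself."""
        self.agent.append(transition)

    def sample(self, batch_size):
        """Draw a batch; each item is a demonstration with probability demo_prob."""
        if not self.demo and not self.agent:
            raise ValueError("both pools are empty")
        batch = []
        for _ in range(batch_size):
            use_demo = self.rng.random() < self.demo_prob
            # Fall back to the other pool if the chosen one is empty (an assumption).
            if use_demo and not self.demo:
                use_demo = False
            elif not use_demo and not self.agent:
                use_demo = True
            pool = self.demo if use_demo else self.agent
            batch.append(self.rng.choice(pool))
        return batch
```

Under these assumptions, `demo_prob=0.0` corresponds to pure self-exploration and `demo_prob=1.0` to pure demonstration, with intermediate values giving the mixed ratios the abstract refers to.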
