Experience enrichment based task independent reward model
This paper addresses the challenge of emergent rewards in complex real-world interactions for reinforcement learning agents, but it appears incremental because it builds on existing reward-modeling concepts.
The paper tackles the problem of manually defined, task-specific rewards in reinforcement learning by proposing an implicit generic reward model that is task-independent and derived from deviations from agents' previous experiences. No concrete numbers are provided.
For most reinforcement learning approaches, the learning is performed by maximizing an accumulative reward that is expectedly and manually defined for specific tasks. However, in the real world, rewards are emergent phenomena arising from the complex interactions between agents and environments. In this paper, we propose an implicit generic reward model for reinforcement learning. Unlike rewards that are manually defined for specific tasks, such an implicit reward is task-independent. It comes only from the deviation from the agents' previous experiences.
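
To make the idea of a reward that "comes from the deviation from previous experiences" concrete, here is a minimal sketch. It is not the paper's model: the class name `ExperienceMemory`, the nearest-neighbor Euclidean distance as the deviation measure, and the `default_reward` for an empty memory are all assumptions chosen for illustration.

```python
import math


class ExperienceMemory:
    """Hypothetical store of past states (illustrative, not from the paper)."""

    def __init__(self, default_reward=1.0):
        # Assumption: reward returned when there is no experience to compare against.
        self.default_reward = default_reward
        self.states = []

    def deviation(self, state):
        """Distance from `state` to the closest previously stored state."""
        if not self.states:
            return self.default_reward
        return min(math.dist(state, past) for past in self.states)

    def reward_and_store(self, state):
        """Return the task-independent reward, then add the state to memory."""
        reward = self.deviation(state)
        self.states.append(tuple(state))
        return reward


memory = ExperienceMemory()
first = memory.reward_and_store((0.0, 0.0))    # empty memory: default reward
repeat = memory.reward_and_store((0.0, 0.0))   # already seen: no deviation
novel = memory.reward_and_store((3.0, 4.0))    # far from anything stored
print(first, repeat, novel)
```

Under these assumptions, revisiting a known state yields zero reward while an unfamiliar state yields a reward proportional to how far it lies from anything seen before, with no task-specific term involved.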