LG, AI, RO | Jul 30, 2023

Rating-based Reinforcement Learning

arXiv:2307.16348v2 | 17 citations | h-index: 38
Originality: Incremental advance
AI Analysis

This work addresses human-in-the-loop reinforcement learning, proposing a rating-based method that is an incremental advance over existing preference-based and ranking-based paradigms.

The paper incorporates human guidance into reinforcement learning through a rating-based approach that uses evaluations of individual trajectories instead of relative comparisons, and evaluates it in experiments with synthetic and real human ratings.

This paper develops a novel rating-based reinforcement learning approach that uses human ratings to obtain human guidance in reinforcement learning. Different from the existing preference-based and ranking-based reinforcement learning paradigms, based on human relative preferences over sample pairs, the proposed rating-based reinforcement learning approach is based on human evaluation of individual trajectories without relative comparisons between sample pairs. The rating-based reinforcement learning approach builds on a new prediction model for human ratings and a novel multi-class loss function. We conduct several experimental studies based on synthetic ratings and real human ratings to evaluate the effectiveness and benefits of the new rating-based reinforcement learning approach.
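The abstract names a prediction model for human ratings and a multi-class loss but does not spell out either. As a rough illustration of the general idea only, the sketch below maps predicted trajectory returns to a distribution over rating classes and scores it with ordinary cross-entropy against the human-assigned class. Everything in it is an assumption made for illustration, not the paper's formulation: the function names, the min-max normalisation, the class centres taken as boundary midpoints, and the sharpness parameter `k`.

```python
import numpy as np


def rating_probabilities(returns, boundaries, k=10.0):
    """Map predicted trajectory returns to probabilities over rating classes.

    Hypothetical model (not taken from the paper): returns are min-max
    normalised to [0, 1]; class i has a centre at the midpoint of
    [boundaries[i], boundaries[i + 1]] and logit -k * (r - centre_i) ** 2.
    """
    r = np.asarray(returns, dtype=float)
    span = r.max() - r.min()
    r_norm = (r - r.min()) / span if span > 0 else np.zeros_like(r)

    b = np.asarray(boundaries, dtype=float)
    centres = (b[:-1] + b[1:]) / 2.0

    # One logit per (trajectory, class); subtract the row max for stability.
    logits = -k * (r_norm[:, None] - centres[None, :]) ** 2
    logits -= logits.max(axis=1, keepdims=True)
    probs = np.exp(logits)
    return probs / probs.sum(axis=1, keepdims=True)


def multiclass_rating_loss(returns, ratings, boundaries, k=10.0):
    """Mean cross-entropy between predicted class probabilities and ratings.

    `ratings` holds one integer class index per trajectory (0 = lowest).
    """
    probs = rating_probabilities(returns, boundaries, k)
    idx = np.arange(len(ratings))
    return float(-np.mean(np.log(probs[idx, ratings] + 1e-12)))


if __name__ == "__main__":
    predicted_returns = [0.0, 5.0, 10.0]      # one value per trajectory
    class_boundaries = [0.0, 1 / 3, 2 / 3, 1.0]  # three rating classes
    human_ratings = [0, 1, 2]                 # e.g. bad, average, good
    print(multiclass_rating_loss(predicted_returns, human_ratings, class_boundaries))
```

In this sketch each trajectory is labelled on its own, so no pairwise comparison is needed to form the loss, which is the contrast with preference-based training that the abstract draws. In practice the predicted returns would come from a learned reward model trained by minimising this loss.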

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
