CL | AI | Apr 12, 2025

A Comprehensive Survey of Reward Models: Taxonomy, Applications, Challenges, and Future

arXiv:2504.12328v1 | 45 citations | h-index: 19 | Has Code
Originality: Synthesis-oriented
AI Analysis

The paper addresses the need for a systematic introduction to reward models for AI researchers and practitioners. Its contribution is incremental: it synthesizes existing knowledge rather than proposing new methods.

This paper surveys reward models (RMs), covering their taxonomy, applications, challenges, and future directions. It delivers a structured overview and publicly available resources intended to help beginners and support further research.

Reward Models (RMs) have demonstrated impressive potential for enhancing Large Language Models (LLMs), as an RM can serve as a proxy for human preferences, providing signals that guide LLM behavior across various tasks. In this paper, we provide a comprehensive overview of relevant research, exploring RMs from the perspectives of preference collection, reward modeling, and usage. Next, we introduce the applications of RMs and discuss the benchmarks used for evaluation. Furthermore, we conduct an in-depth analysis of the challenges in the field and examine potential research directions. This paper aims to give beginners a comprehensive introduction to RMs and to facilitate future studies. The resources are publicly available on GitHub at https://github.com/JLZhong23/awesome-reward-models.

Code Implementations: 1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
