AP DL GT LG ML · Aug 24, 2024

The ICML 2023 Ranking Experiment: Examining Author Self-Assessment in ML/AI Peer Review

Buxin Su, Jiayao Zhang, Natalie Collina, Yuling Yan, Didong Li, Kyunghyun Cho, Jianqing Fan, Aaron Roth, Weijie Su

Princeton

arXiv:2408.13430v38.012 citations · h-index: 11 · Has Code

Originality: Synthesis-oriented

AI Analysis

This work addresses peer review inefficiencies at ML/AI conferences, but its contribution is incremental because it builds on existing mechanisms.

The study tackled the problem of improving peer review at ML conferences by analyzing author self-assessments, showing that ranking-calibrated scores reduced error in estimating expected review scores compared to raw scores.

We conducted an experiment during the review process of the 2023 International Conference on Machine Learning (ICML), asking authors with multiple submissions to rank their papers based on perceived quality. In total, we received 1,342 rankings, each from a different author, covering 2,592 submissions. In this paper, we present an empirical analysis of how author-provided rankings could be leveraged to improve peer review processes at machine learning conferences. We focus on the Isotonic Mechanism, which calibrates raw review scores using the author-provided rankings. Our analysis shows that these ranking-calibrated scores outperform the raw review scores in estimating the ground truth "expected review scores" in terms of both squared and absolute error metrics. Furthermore, we propose several cautious, low-risk applications of the Isotonic Mechanism and author-provided rankings in peer review, including supporting senior area chairs in overseeing area chairs' recommendations, assisting in the selection of paper awards, and guiding the recruitment of emergency reviewers.
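As a rough illustration of what "calibrating raw review scores using the author-provided rankings" can look like, the sketch below projects one author's raw scores onto the set of score vectors that agree with that author's ranking, using least-squares isotonic regression (pool adjacent violators). This is a minimal sketch under stated assumptions, not the authors' released code: the function name `isotonic_calibrate`, the input layout (one average score per paper, and a list of paper indices ordered from best to worst), and the toy numbers are all hypothetical.

```python
def isotonic_calibrate(scores, ranking):
    """Return ranking-consistent scores closest (in squared error) to `scores`.

    scores  : list of raw average review scores, one per paper (hypothetical input).
    ranking : list of paper indices, ordered from the author's best to worst paper.
    """
    # Reorder so that position 0 holds the author's top-ranked paper.
    ordered = [scores[i] for i in ranking]

    # Pool Adjacent Violators for a non-increasing fit.
    # Each block stores [sum of scores, number of papers].
    blocks = []
    for y in ordered:
        blocks.append([y, 1])
        # A lower-ranked block with a higher mean violates the ranking: merge.
        while (len(blocks) > 1
               and blocks[-2][0] / blocks[-2][1] < blocks[-1][0] / blocks[-1][1]):
            s, c = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += c

    # Expand block means back to one fitted value per ranked position.
    fitted = []
    for s, c in blocks:
        fitted.extend([s / c] * c)

    # Map fitted values back to the original paper order.
    calibrated = [0.0] * len(scores)
    for pos, paper in enumerate(ranking):
        calibrated[paper] = fitted[pos]
    return calibrated


# Toy example: the author ranks paper 0 above paper 1 above paper 2,
# but reviewers scored paper 1 higher than paper 0.
print(isotonic_calibrate([5.0, 6.0, 4.0], [0, 1, 2]))  # [5.5, 5.5, 4.0]
```

In the toy example the two papers whose raw scores contradict the ranking are pooled to their mean, while the third score is left unchanged. Scores that already agree with the ranking pass through untouched.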
