LG DL · Jun 5, 2023

Has the Machine Learning Review Process Become More Arbitrary as the Field Has Grown? The NeurIPS 2021 Consistency Experiment

Alina Beygelzimer, Yann N. Dauphin, Percy Liang, Jennifer Wortman Vaughan

Microsoft

arXiv:2306.03262v1 · 18.427 citations · h-index: 102

Originality: Synthesis-oriented

AI Analysis

This research highlights the inherent arbitrariness of peer review at machine learning conferences. It is an incremental analysis that builds on prior experiments.

The study quantified randomness in the NeurIPS 2021 review process by having two independent committees review 10% of submissions. It found that the committees disagreed on accept/reject recommendations for 23% of those papers, and that about half of the accepted papers would change if the process were rerun.

We present the NeurIPS 2021 consistency experiment, a larger-scale variant of the 2014 NeurIPS experiment in which 10% of conference submissions were reviewed by two independent committees to quantify the randomness in the review process. We observe that the two committees disagree on their accept/reject recommendations for 23% of the papers and that, consistent with the results from 2014, approximately half of the list of accepted papers would change if the review process were randomly rerun. Our analysis suggests that making the conference more selective would increase the arbitrariness of the process. Taken together with previous research, our results highlight the inherent difficulty of objectively measuring the quality of research, and suggest that authors should not be excessively discouraged by rejected work.
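The link between the two headline numbers (23% disagreement, roughly half of accepted papers changing) can be sketched with simple arithmetic. This is an illustrative back-of-the-envelope calculation, not the authors' analysis. It assumes both committees accept the same share of papers, so that disagreements split evenly between accept/reject and reject/accept. The function name and the acceptance rates tried below are hypothetical, since the abstract does not state an acceptance rate.

```python
def fraction_of_accepted_that_flip(disagreement_rate: float, acceptance_rate: float) -> float:
    """Share of one committee's accepted papers that the other committee rejects.

    Assumption (not from the source): both committees accept the same fraction
    of papers, so half of all disagreements are accept-then-reject cases.
    """
    if not 0 < acceptance_rate < 1:
        raise ValueError("acceptance_rate must be strictly between 0 and 1")
    # Accept/reject disagreements cannot exceed the accepted or rejected pool.
    max_disagreement = 2 * min(acceptance_rate, 1 - acceptance_rate)
    if not 0 <= disagreement_rate <= max_disagreement:
        raise ValueError("disagreement_rate is inconsistent with acceptance_rate")
    accepted_then_rejected = disagreement_rate / 2
    return accepted_then_rejected / acceptance_rate


if __name__ == "__main__":
    reported_disagreement = 0.23  # from the abstract
    # Hypothetical acceptance rates, chosen only for illustration.
    for acceptance_rate in (0.20, 0.23, 0.25, 0.30):
        flip = fraction_of_accepted_that_flip(reported_disagreement, acceptance_rate)
        print(f"acceptance rate {acceptance_rate:.2f}: {flip:.0%} of accepted papers would change")
```

Under these assumptions, a 23% disagreement rate corresponds to about half of accepted papers changing when the acceptance rate is itself close to 23%. The loop holds the disagreement rate fixed while varying the acceptance rate, which is a simplification; the paper's claim about selectivity rests on its own analysis rather than on this formula.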
