What Can We Do to Improve Peer Review in NLP?
This paper tackles the issue of declining review quality, which affects NLP researchers and conference organizers, but it is incremental: it discusses potential solutions without presenting new data or methods.
The paper addresses the problem of poorly defined peer review tasks at NLP conferences, which lead to inconsistent evaluations, and argues that the key difficulty is creating the incentives and mechanisms needed to implement improvements consistently.
Peer review is our best tool for judging the quality of conference submissions, but it is becoming increasingly spurious. We argue that a part of the problem is that the reviewers and area chairs face a poorly defined task forcing apples-to-oranges comparisons. There are several potential ways forward, but the key difficulty is creating the incentives and mechanisms for their consistent implementation in the NLP community.