On the Privacy-Utility Tradeoff in Peer-Review Data Analysis
This paper addresses the challenge of enabling research on improving peer review by providing a method to release sensitive review data with privacy guarantees. The contribution is incremental, however, since it builds on existing privacy mechanisms.
The paper tackles the problem of releasing peer-review data while protecting reviewer identities. It proposes a framework that improves the accuracy of privacy-preserving data releases, and its theoretical results include a polynomial-time algorithm for improving the privacy-utility tradeoff.
A major impediment to research on improving peer review is the unavailability of peer-review data, since any release of such data must grapple with its sensitivity, in particular the need to protect the identities of reviewers from authors. We posit the need to develop techniques to release peer-review data in a privacy-preserving manner. To this end, we propose a framework for privacy-preserving release of certain conference peer-review data, namely distributions of ratings, miscalibration, and subjectivity, with an emphasis on the accuracy (or utility) of the released data. The crux of the framework lies in recognizing that part of the data pertaining to the reviews is already publicly available, and we use this information to post-process the data released by any privacy mechanism in a manner that improves the accuracy (utility) of the data while retaining the privacy guarantees. Our framework works with any privacy-preserving mechanism that operates by releasing perturbed data. We present several positive and negative theoretical results, including a polynomial-time algorithm for improving the privacy-utility tradeoff.
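
The post-processing idea in the abstract can be illustrated with a minimal sketch. This is not the paper's algorithm. It assumes the Laplace mechanism as the underlying privacy mechanism, a hypothetical histogram of ratings, and one hypothetical piece of public side information: the total number of reviews. The noisy histogram is projected onto the set of non-negative histograms with that total. Because the projection uses only the released data and public information, it is post-processing and does not weaken a differential-privacy guarantee. Because the true histogram lies in that convex set, the Euclidean projection cannot increase the L2 error. All names and numbers below are illustrative assumptions.

```python
import random


def laplace_noise(scale, rng):
    # The difference of two i.i.d. Exp(1) variables, scaled, is Laplace(0, scale).
    return scale * (rng.expovariate(1.0) - rng.expovariate(1.0))


def laplace_mechanism(hist, epsilon, sensitivity, rng):
    # Perturb every bin independently with Laplace noise of scale sensitivity/epsilon.
    scale = sensitivity / epsilon
    return [count + laplace_noise(scale, rng) for count in hist]


def project_to_simplex(values, total):
    # Euclidean projection onto {x : x_i >= 0, sum(x) = total} (sort-based method).
    ordered = sorted(values, reverse=True)
    running_sum = 0.0
    theta = 0.0
    for j, u in enumerate(ordered, start=1):
        running_sum += u
        candidate = (running_sum - total) / j
        if u - candidate > 0:
            theta = candidate  # the last j that passes gives the final shift
    return [max(v - theta, 0.0) for v in values]


def l2_error(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


rng = random.Random(0)

# Hypothetical private data: number of reviews giving each rating 1..5.
true_hist = [3, 12, 40, 30, 15]

# Assumed public side information: the total number of reviews.
public_total = sum(true_hist)

# Assumption: changing one review moves one unit between two bins,
# so the L1 sensitivity of the histogram is 2.
noisy_hist = laplace_mechanism(true_hist, epsilon=0.5, sensitivity=2.0, rng=rng)

# Post-processing step: uses only the noisy release and the public total.
post_hist = project_to_simplex(noisy_hist, public_total)

print("L2 error before post-processing:", l2_error(noisy_hist, true_hist))
print("L2 error after post-processing: ", l2_error(post_hist, true_hist))
```

The projection is a design choice made for this sketch: it is a simple way to enforce public constraints, and its non-expansiveness gives a guarantee that holds for every noise draw, not only on average. The paper's framework covers other released quantities (miscalibration, subjectivity) and other forms of public information, which this example does not attempt to model.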