AI · GT · LG · Jun 13, 2024

ElicitationGPT: Text Elicitation Mechanisms via Language Models

arXiv:2406.09363v3 · 11 citations
Originality: Incremental advance
AI Analysis

This paper addresses the need for automated scoring mechanisms with provable guarantees in information elicitation, which may be useful for AI applications. The advance is incremental: it applies existing scoring rule theory to a new domain, text.

The paper tackles the problem of scoring elicited text against ground-truth text by reducing it to forecast elicitation, using queries to a large language model (ChatGPT). On a peer-grading dataset, it compares the resulting scores for peer reviews with manual instructor scores and reports that they align with human preferences.

Scoring rules evaluate probabilistic forecasts of an unknown state against the realized state and are a fundamental building block in the incentivized elicitation of information. This paper develops mechanisms for scoring elicited text against ground truth text by reducing the textual information elicitation problem to a forecast elicitation problem, via domain-knowledge-free queries to a large language model (specifically ChatGPT), and empirically evaluates their alignment with human preferences. Our theoretical analysis shows that the reduction achieves provable properness via black-box language models. The empirical evaluation is conducted on peer reviews from a peer-grading dataset, in comparison to manual instructor scores for the peer reviews. Our results suggest a paradigm of algorithmic artificial intelligence that may be useful for developing artificial intelligence technologies with provable guarantees.
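To make the notion of properness in the abstract concrete, here is a minimal sketch of a standard proper scoring rule for a binary state, the quadratic (Brier-style) rule. It is not the paper's mechanism and involves no language model; the function names `quadratic_score`, `expected_score`, and `best_report` are illustrative assumptions. It only shows the property the abstract refers to: a forecaster maximizes expected score by reporting their true belief.

```python
def quadratic_score(forecast: float, state: int) -> float:
    """Quadratic score for a binary state (0 or 1), scaled to lie in [0, 1].

    `forecast` is the reported probability that the state equals 1.
    """
    return 1.0 - (state - forecast) ** 2


def expected_score(report: float, belief: float) -> float:
    """Expected score of reporting `report` when the state is 1 with probability `belief`."""
    return (belief * quadratic_score(report, 1)
            + (1.0 - belief) * quadratic_score(report, 0))


def best_report(belief: float, grid_size: int = 101) -> float:
    """Grid-search the report in [0, 1] that maximizes expected score under `belief`."""
    grid = [i / (grid_size - 1) for i in range(grid_size)]
    return max(grid, key=lambda r: expected_score(r, belief))


if __name__ == "__main__":
    belief = 0.7  # hypothetical true belief, chosen for illustration
    # Properness: the maximizing report should coincide with the belief.
    print("best report:", best_report(belief))
    print("truthful:", expected_score(belief, belief))
    print("exaggerated:", expected_score(0.9, belief))
```

Under this rule the expected score is maximized at `report == belief`, so misreporting (for example, exaggerating to 0.9) can only lower the expected score. The paper's contribution, per the abstract, is carrying this kind of guarantee over to text by way of language-model queries.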

