AI · GT · Jul 8, 2025

Aligned Textual Scoring Rules

arXiv:2507.06221v1 · 1 citation · h-index: 5
Originality: Incremental advance
AI Analysis

This paper addresses the challenge of eliciting truthful, human-aligned text predictions from strategic agents such as language models, though it is an incremental improvement over existing proper scoring rule frameworks.

The paper designs scoring rules for textual information elicitation that are both mathematically proper and aligned with human preferences. It does so by optimizing a proper scoring rule to minimize the mean squared error against a reference score, such as a human score. The proposed Aligned Scoring Rule (ASR) outperforms previous methods in aligning with human preference while maintaining properness.
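
In symbols, the objective described above might be written as follows. This is a sketch in assumed notation, not the paper's own: $\mathcal{S}$ stands for a class of proper scoring rules, $r$ for the reported text, $\theta$ for the ground truth state, and $h(r,\theta)$ for the reference (e.g. human) score.

```latex
\min_{S \in \mathcal{S}} \;
\mathbb{E}_{(r,\theta)}\!\left[ \bigl( S(r,\theta) - h(r,\theta) \bigr)^{2} \right]
```

Restricting the minimization to $\mathcal{S}$ is what keeps the fitted rule proper; the squared-error term is what pulls it toward the reference score.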

Scoring rules elicit probabilistic predictions from a strategic agent by scoring the prediction against a ground truth state. A scoring rule is proper if, from the agent's perspective, reporting the true belief maximizes the expected score. With the development of language models, Wu and Hartline (2024) propose a reduction from textual information elicitation to the numerical (i.e., probabilistic) information elicitation problem, which achieves provable properness for textual elicitation. However, not all proper scoring rules are well aligned with human preference over text. Our paper designs the Aligned Scoring Rule (ASR) for text by minimizing the mean squared error between a proper scoring rule and a reference score (e.g., a human score). Our experiments show that our ASR outperforms previous methods in aligning with human preference while maintaining properness.
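
To make properness and alignment concrete, here is a minimal Python sketch under simplifying assumptions. It is not the paper's ASR construction: it uses a binary state, the standard quadratic (Brier-style) score as the proper rule, and a positive affine rescaling as the only family being fitted. The function names (`quadratic_score`, `expected_score`, `best_report`, `fit_affine_alignment`) are hypothetical. The sketch relies on the fact that a positive affine transformation of a proper scoring rule is still proper.

```python
import numpy as np


def quadratic_score(q, state):
    """Quadratic score for a binary state; q is the reported P(state = 1)."""
    return 1.0 - (state - q) ** 2


def expected_score(q, p):
    """Expected score of report q when the agent's true belief is P(state = 1) = p."""
    return p * quadratic_score(q, 1) + (1.0 - p) * quadratic_score(q, 0)


def best_report(p, grid):
    """Report on the grid that maximizes expected score under belief p."""
    scores = [expected_score(q, p) for q in grid]
    return float(grid[int(np.argmax(scores))])


def fit_affine_alignment(proper_scores, reference_scores, min_slope=1e-6):
    """Fit a * S + b to reference scores by least squares (minimum MSE).

    Assumption for illustration: the slope is kept strictly positive so that
    the rescaled rule remains proper.
    """
    s = np.asarray(proper_scores, dtype=float)
    h = np.asarray(reference_scores, dtype=float)
    design = np.column_stack([s, np.ones_like(s)])
    (a, b), *_ = np.linalg.lstsq(design, h, rcond=None)
    if a <= 0:
        # Unconstrained fit would flip incentives; fall back to a tiny
        # positive slope and the intercept that is optimal for that slope.
        a = min_slope
        b = float(np.mean(h - a * s))
    return float(a), float(b)


if __name__ == "__main__":
    grid = np.linspace(0.0, 1.0, 101)
    # Properness check: the best report should coincide with the true belief.
    print("best report for belief 0.3:", best_report(0.3, grid))

    # Alignment check on made-up numbers (illustrative only, not real data).
    scores = np.array([0.1, 0.5, 0.9, 0.4])
    reference = 2.0 * scores + 1.0
    print("fitted (a, b):", fit_affine_alignment(scores, reference))
```

The first check illustrates the definition of properness: under belief `p`, no report on the grid earns a higher expected score than the one closest to `p`. The second shows the alignment idea in its simplest form, choosing among proper rules the one with the smallest mean squared error against a reference score.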

