CL | AI | Jan 5

ModeX: Evaluator-Free Best-of-N Selection for Open-Ended Generation

arXiv:2601.02535v1 | 2 citations | Has Code
Originality: Incremental advance
AI Analysis

The paper addresses efficient and robust output selection in open-ended LLM generation, offering a computationally efficient solution that requires no external evaluators. The advance is incremental: it builds on existing Best-of-N and self-consistency methods.

The paper tackles the challenge of selecting a high-quality output from multiple stochastic generations in open-ended LLM tasks. It proposes ModeX, an evaluator-free Best-of-N selection framework that identifies modal outputs based on semantic consensus, and reports consistent performance improvements across text summarization, code generation, and mathematical reasoning.

Selecting a single high-quality output from multiple stochastic generations remains a fundamental challenge for large language models (LLMs), particularly in open-ended tasks where no canonical answer exists. While Best-of-N and self-consistency methods show that aggregating multiple generations can improve performance, existing approaches typically rely on external evaluators, reward models, or exact string-match voting, limiting their applicability and efficiency. We propose Mode Extraction (ModeX), an evaluator-free Best-of-N selection framework that generalizes majority voting to open-ended text generation by identifying the modal output representing the dominant semantic consensus among generated texts. ModeX constructs a similarity graph over candidate generations and recursively applies spectral clustering to select a representative centroid, without requiring additional inference or auxiliary models. We further instantiate this selection principle as ModeX-Lite, an improved version of ModeX with early pruning for efficiency. Across open-ended tasks -- including text summarization, code generation, and mathematical reasoning -- our approaches consistently outperform standard single- and multi-path baselines, providing a computationally efficient solution for robust open-ended text generation. Code is released at https://github.com/deeplearning-wisc/ModeX.
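
The consensus idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes token-level Jaccard overlap as a stand-in for whatever similarity measure ModeX uses, and it replaces the recursive spectral clustering step with a plain medoid choice (the candidate with the highest total similarity to the others). The function names `tokens`, `jaccard`, `similarity_graph`, and `select_modal` are hypothetical.

```python
import re


def tokens(text):
    """Lowercased word set; a crude stand-in for a semantic representation."""
    return set(re.findall(r"\w+", text.lower()))


def jaccard(a, b):
    """Jaccard overlap between two token sets (1.0 when both are empty)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def similarity_graph(candidates):
    """Dense symmetric matrix of pairwise similarities between candidates."""
    sets = [tokens(c) for c in candidates]
    n = len(sets)
    return [[jaccard(sets[i], sets[j]) for j in range(n)] for i in range(n)]


def select_modal(candidates):
    """Return the index of the candidate most similar to all others (medoid).

    Assumption: a simplified substitute for recursive spectral clustering.
    """
    if not candidates:
        raise ValueError("need at least one candidate")
    graph = similarity_graph(candidates)
    # Exclude self-similarity so the diagonal does not inflate any score.
    scores = [sum(row) - row[i] for i, row in enumerate(graph)]
    return max(range(len(candidates)), key=scores.__getitem__)


candidates = [
    "The cat sat on the mat.",
    "A cat sat on the mat.",
    "The cat was sitting on the mat.",
    "Stock prices fell sharply on Monday.",
]
best = select_modal(candidates)
print(candidates[best])
```

In this toy run the three cat sentences reinforce one another while the off-topic candidate receives almost no support, so the selected output comes from the majority group. No reward model or extra LLM call is involved, which is the property the abstract emphasizes.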

