LG | AI | Mar 13, 2025

Conformal Prediction Sets for Deep Generative Models via Reduction to Conformal Regression

arXiv:2503.10512v2 | 9 citations | h-index: 18 | UAI
Originality: Highly original
AI Analysis

This paper addresses the need for reliable uncertainty quantification in generative AI outputs, for example guaranteeing that a set of generated programs contains at least one valid program. It is an incremental improvement over existing conformal methods.

The paper tackles the problem of generating valid and small prediction sets from deep generative models for applications like code generation, by developing the Generative Prediction Sets (GPS) algorithm that provides provable guarantees. Experiments on code and math word problems with large language models show GPS outperforms state-of-the-art methods.

We consider the problem of generating valid and small prediction sets by sampling outputs (e.g., software code and natural language text) from a black-box deep generative model for a given input (e.g., textual prompt). The validity of a prediction set is determined by a user-defined binary admissibility function that depends on the target application. For example, in a code generation application, validity may require at least one program in the set to pass all test cases. To address this problem, we develop a simple and effective conformal inference algorithm referred to as Generative Prediction Sets (GPS). Given a set of calibration examples and black-box access to a deep generative model, GPS can generate prediction sets with provable guarantees. The key insight behind GPS is to exploit the inherent structure in the distribution over the minimum number of samples needed to obtain an admissible output, which reduces the problem to a simple conformal regression over that minimum number of samples. Experiments on multiple datasets for code and math word problems using different large language models demonstrate the efficacy of GPS over state-of-the-art methods.
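The reduction described in the abstract can be illustrated with a minimal split-conformal sketch. This is not the authors' implementation: the function names (`min_samples_to_admissible`, `conformal_offset`, `prediction_set_size`), the sampling budget, the encoding of failures as `budget + 1`, the abstention rule, and the per-input predictions `k_pred` (which would come from some regressor not shown here) are all assumptions made for illustration.

```python
import math
from typing import Callable, Optional, Sequence


def min_samples_to_admissible(
    sample: Callable[[], object],
    is_admissible: Callable[[object], bool],
    budget: int,
) -> int:
    """Draw up to `budget` outputs from a black-box sampler.

    Returns the 1-based index of the first admissible output,
    or budget + 1 if none of the draws is admissible (assumed encoding).
    """
    for k in range(1, budget + 1):
        if is_admissible(sample()):
            return k
    return budget + 1


def conformal_offset(
    k_true: Sequence[float], k_pred: Sequence[float], alpha: float
) -> float:
    """One-sided split-conformal quantile of the residuals k_true - k_pred.

    With n calibration points, takes the ceil((n + 1) * (1 - alpha))-th
    smallest residual; returns infinity when n is too small for that rank.
    """
    n = len(k_true)
    scores = sorted(t - p for t, p in zip(k_true, k_pred))
    rank = math.ceil((n + 1) * (1 - alpha))
    if rank > n:
        return math.inf
    return scores[rank - 1]


def prediction_set_size(
    k_pred_test: float, offset: float, budget: int
) -> Optional[int]:
    """Number of samples to draw for a test input, or None to abstain.

    Abstains when the calibrated bound exceeds the sampling budget,
    because the bound then also covers the "no admissible output" case.
    """
    size = k_pred_test + offset
    if size > budget:
        return None
    # The minimum number of samples is an integer >= 1, so round down
    # but always draw at least one sample.
    return max(1, math.floor(size))
```

Under the usual exchangeability assumption between calibration and test inputs, drawing `prediction_set_size(...)` samples for a test input yields a set containing an admissible output with probability at least 1 - alpha whenever the function does not abstain. For example, with calibration counts 1 through 10, constant predictions of 0, and alpha = 0.2, the offset is the 9th smallest residual, so the sketch would draw 9 samples per test input.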
