AI, CV
Dec 31, 2025

Explicit Abstention Knobs for Predictable Reliability in Video Question Answering

arXiv:2601.00138v2
1 citation
h-index: 1
Originality: Incremental advance
AI Analysis

This work addresses the need for predictable reliability in high-stakes VLM deployments, though it is incremental because it builds on existing abstention methods.

The study investigated whether confidence-based abstention in video question answering provides reliable control over error rates. It found that the approach offers mechanistic control in-distribution, with error rates reduced by up to 50% at high coverage, but that this control fails under distribution shift.

High-stakes deployment of vision-language models (VLMs) requires selective prediction, where systems abstain when uncertain rather than risk costly errors. We investigate whether confidence-based abstention provides reliable control over error rates in video question answering, and whether that control remains robust under distribution shift. Using NExT-QA and Gemini 2.0 Flash, we establish two findings. First, confidence thresholding provides mechanistic control in-distribution. Sweeping threshold epsilon produces smooth risk-coverage tradeoffs, reducing error rates f
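
The threshold sweep described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the function name `risk_coverage_curve`, the toy confidence scores, and the convention that the system answers when its confidence is at least epsilon (and abstains otherwise) are all assumptions made for illustration. Coverage is the fraction of questions answered, and risk is the error rate among the answered questions.

```python
def risk_coverage_curve(confidences, correct, thresholds):
    """Return (epsilon, coverage, risk) for each threshold.

    Assumed convention (not taken from the paper): the model answers
    when confidence >= epsilon and abstains otherwise.
    """
    total = len(confidences)
    curve = []
    for eps in thresholds:
        # Keep the correctness flags of the questions the model answers.
        answered = [ok for conf, ok in zip(confidences, correct) if conf >= eps]
        coverage = len(answered) / total
        # Risk is the error rate on answered questions; report 0.0 by
        # convention when the model abstains on everything.
        risk = answered.count(False) / len(answered) if answered else 0.0
        curve.append((eps, coverage, risk))
    return curve


# Hypothetical confidence scores and correctness labels for six questions.
toy_confidences = [0.95, 0.9, 0.8, 0.7, 0.6, 0.5]
toy_correct = [True, True, True, False, True, False]

for eps, coverage, risk in risk_coverage_curve(
    toy_confidences, toy_correct, [0.0, 0.65, 0.75]
):
    print(f"epsilon={eps:.2f}  coverage={coverage:.2f}  risk={risk:.2f}")
```

On this toy data, raising epsilon lowers both coverage and risk. A curve of this kind only controls the error rate if the confidence scores stay informative, which is the property the abstract says breaks down under distribution shift.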

