AI LG Feb 6

Agentic Uncertainty Reveals Agentic Overconfidence

Jean Kaddour, Srijan Patel, Gbètondji Dovonon, Leo Richter, Pasquale Minervini, Matt J. Kusner

arXiv:2602.06948v17.53 citations h-index: 15

Originality: Incremental advance

AI Analysis

This work addresses unreliable self-assessment in AI agents, which matters to users who rely on an agent's own success predictions. Its contribution to calibration methods is incremental.

The study examines whether AI agents can predict their own task success. It finds that agents are overconfident: some succeed only 22% of the time while predicting 77% success. Adversarial prompting that reframes assessment as bug-finding achieved the best calibration.

Can AI agents predict whether they will succeed at a task? We study agentic uncertainty by eliciting success probability estimates before, during, and after task execution. All results exhibit agentic overconfidence: some agents that succeed only 22% of the time predict 77% success. Counterintuitively, pre-execution assessment with strictly less information tends to yield better discrimination than standard post-execution review, though differences are not always significant. Adversarial prompting reframing assessment as bug-finding achieves the best calibration.
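The abstract separates two properties of a success estimate: calibration (do predicted probabilities match the observed success rate?) and discrimination (do successful runs receive higher estimates than failed ones?). The following is a minimal sketch of how each could be measured, assuming one probability estimate and one pass/fail outcome per task. The function names and the four data points are illustrative assumptions, not the paper's code or results, and the paper may use different metrics.

```python
from typing import Sequence


def overconfidence_gap(predicted: Sequence[float], succeeded: Sequence[bool]) -> float:
    """Mean predicted success probability minus the observed success rate.

    A positive value means the agent is overconfident on average.
    """
    mean_predicted = sum(predicted) / len(predicted)
    success_rate = sum(succeeded) / len(succeeded)
    return mean_predicted - success_rate


def auroc(predicted: Sequence[float], succeeded: Sequence[bool]) -> float:
    """Probability that a randomly chosen success received a higher estimate
    than a randomly chosen failure (ties count as half)."""
    successes = [p for p, s in zip(predicted, succeeded) if s]
    failures = [p for p, s in zip(predicted, succeeded) if not s]
    if not successes or not failures:
        raise ValueError("need at least one success and one failure")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in successes
        for n in failures
    )
    return wins / (len(successes) * len(failures))


# Hypothetical data for illustration only: four tasks, one of which succeeded.
estimates = [0.9, 0.8, 0.7, 0.6]
outcomes = [False, True, False, False]

print("overconfidence gap:", overconfidence_gap(estimates, outcomes))
print("AUROC:", auroc(estimates, outcomes))
```

On this reading, the headline figure in the abstract (77% predicted against 22% actual success) corresponds to an average gap of 55 percentage points, while the pre-execution versus post-execution comparison concerns the second quantity, discrimination.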

