Learning with Confidence
This work addresses a foundational problem in machine learning and AI by clarifying and formalizing confidence. The contribution is incremental in the sense that it builds on existing concepts such as Bayes Rule.
The paper formalizes a notion of confidence in learning, distinct from probability, that captures concepts like learning rates and Kalman gain, and provides axiomatic foundations and representations for confidence-based learning.
We characterize a notion of confidence that arises in learning or updating beliefs: the amount of trust one has in incoming information and its impact on the belief state. This learner's confidence can be used alongside (and is easily mistaken for) probability or likelihood, but it is fundamentally a different concept -- one that captures many familiar concepts in the literature, including learning rates and number of training epochs, Shafer's weight of evidence, and Kalman gain. We formally axiomatize what it means to learn with confidence, give two canonical ways of measuring confidence on a continuum, and prove that confidence can always be represented in this way. Under additional assumptions, we derive more compact representations of confidence-based learning in terms of vector fields and loss functions. These representations induce an extended language of compound "parallel" observations. We characterize Bayes Rule as the special case of an optimizing learner whose loss representation is a linear expectation.
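As an informal illustration of how a learning rate and a Kalman gain can both be read as a degree of trust in incoming information, here is a minimal sketch. It is not the paper's formalism. The function names `update` and `kalman_gain`, the scalar belief state, and the [0, 1] confidence scale are all assumptions made for this example.

```python
def update(belief: float, observation: float, confidence: float) -> float:
    """Move a scalar belief toward an observation (hypothetical helper).

    confidence = 0.0 ignores the observation entirely;
    confidence = 1.0 adopts the observation outright.
    Intermediate values behave like a learning rate.
    """
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must lie in [0, 1]")
    return belief + confidence * (observation - belief)


def kalman_gain(prior_variance: float, noise_variance: float) -> float:
    """Scalar Kalman gain: how much a noisy measurement is trusted
    relative to the current estimate (assumes positive variances)."""
    return prior_variance / (prior_variance + noise_variance)


# A noisy sensor (variance 3.0) observed against a prior with variance 1.0
# yields a gain that can be used directly as the confidence in `update`.
gain = kalman_gain(prior_variance=1.0, noise_variance=3.0)
posterior_mean = update(belief=2.0, observation=10.0, confidence=gain)
```

In this sketch the same update rule covers both cases: a fixed confidence plays the role of a learning rate, while a confidence computed from variances reproduces the mean update of a scalar Kalman filter.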