Algorithmic Identification of Probabilities
This addresses foundational issues in algorithmic learning theory for researchers in computational statistics and machine learning, offering incremental theoretical extensions to existing identification frameworks.
The paper tackles the problem of identifying a probability measure from an infinite data sequence, showing that if the sequence is i.i.d. and the target belongs to a computably enumerable or co-computably enumerable set, an algorithm can almost surely identify it in the limit using the strong law of large numbers; for dependent sequences typical in the Martin-Löf sense, it provides an algorithm to identify a computable measure using Kolmogorov complexity theory.
TThe problem is to identify a probability associated with a set of natural numbers, given an infinite data sequence of elements from the set. If the given sequence is drawn i.i.d. and the probability mass function involved (the target) belongs to a computably enumerable (c.e.) or co-computably enumerable (co-c.e.) set of computable probability mass functions, then there is an algorithm to almost surely identify the target in the limit. The technical tool is the strong law of large numbers. If the set is finite and the elements of the sequence are dependent while the sequence is typical in the sense of Martin-Löf for at least one measure belonging to a c.e. or co-c.e. set of computable measures, then there is an algorithm to identify in the limit a computable measure for which the sequence is typical (there may be more than one such measure). The technical tool is the theory of Kolmogorov complexity. We give the algorithms and consider the associated predictions.