67.0 FL Mar 17
Solomonoff induction
Tom F. Sterkenburg
This chapter discusses the Solomonoff approach to universal prediction. The crucial ingredient in the approach is the notion of computability, and I present the main idea as an attempt to meet two plausible computability desiderata for a universal predictor. This attempt is unsuccessful, which is shown by a generalization of a diagonalization argument due to Putnam. I then critically discuss purported gains of the approach, in particular the claims that it provides a foundation for the methodological principle of Occam's razor and that it serves as a theoretical ideal for the development of machine learning methods.
LG Dec 21, 2023
Statistical learning theory and Occam's razor: The core argument
Tom F. Sterkenburg
Statistical learning theory is often associated with the principle of Occam's razor, which recommends a simplicity preference in inductive inference. This paper distills the core argument for simplicity obtainable from statistical learning theory, built on the theory's central learning guarantee for the method of empirical risk minimization. This core "means-ends" argument is that a simpler hypothesis class or inductive model is better because it has better learning guarantees; however, these guarantees are model-relative and so the theoretical push towards simplicity is checked by our prior knowledge.
LG Feb 10, 2022
On characterizations of learnability with computable learners
Tom F. Sterkenburg
We study computable PAC (CPAC) learning as introduced by Agarwal et al. (2020). First, we consider the main open question of finding characterizations of proper and improper CPAC learning. We give a characterization of a closely related notion of strong CPAC learning, and provide a negative answer to the COLT open problem posed by Agarwal et al. (2021) whether all decidably representable VC classes are improperly CPAC learnable. Second, we consider undecidability of (computable) PAC learnability. We give a simple general argument to exhibit such undecidability, and initiate a study of the arithmetical complexity of learnability. We briefly discuss the relation to the undecidability result of Ben-David et al. (2019), which motivated the work of Agarwal et al.
LG Feb 9, 2022
The no-free-lunch theorems of supervised learning
Tom F. Sterkenburg, Peter D. Grünwald
The no-free-lunch theorems promote a skeptical conclusion that all possible machine learning algorithms equally lack justification. But how could this leave room for a learning theory that shows that some algorithms are better than others? Drawing parallels to the philosophy of induction, we point out that the no-free-lunch results presuppose a conception of learning algorithms as purely data-driven. On this conception, every algorithm must have an inherent inductive bias that wants justification. We argue that many standard learning algorithms should rather be understood as model-dependent: in each application they also require as input a model, representing a bias. Being generic algorithms themselves, they can be given a model-relative justification.