Theory of Machine Learning with Limited Data
The paper addresses the problem of justifying, for practitioners, ML methods trained on limited data, but its contribution is incremental: it reinterprets existing learners rather than introducing new ones.
The paper formalizes abduction for real-valued hypotheses and shows that 14 popular ML learners implement this form of inference, offering an alternative to statistical learning theory, which relies on the impractical assumption of an indefinitely increasing training set.
Applying machine learning may be understood as deriving new knowledge for practical use by explaining accumulated observations, that is, the training set. Peirce used the term abduction for this kind of inference. Here I formalize the concept of abduction for real-valued hypotheses and show that 14 of the most popular textbook ML learners (every learner I tested), covering classification, regression, and clustering, implement this concept of abductive inference. The approach is proposed as an alternative to statistical learning theory, which requires the impractical assumption of an indefinitely increasing training set for its justification.