Practical machine learning is learning on small samples
This work addresses the challenge of small sample sizes in machine learning, but it is incremental as it reframes existing methods under a new theoretical perspective.
The paper tackles the problem of machine learning with limited data by proposing that practical learning relies on the assumption of smooth underlying dependencies, formalizing this as the Practical learning paradigm. It shows that popular learners like k-NN and SVM implement this paradigm, but provides no concrete numerical results.
Based on limited observations, machine learning discerns a dependence that is expected to hold in the future. What makes this possible? Statistical learning theory imagines an indefinitely increasing training sample to justify its approach. In reality, there is neither infinite time nor even an infinite general population for learning. Here I argue that practical machine learning is based on an implicit assumption that the underlying dependence is relatively ``smooth'': most likely, there are no abrupt differences in feedback between cases with close data points. From this point of view, learning should involve selecting a hypothesis that ``smoothly'' approximates the training set. I formalize this as the Practical learning paradigm. The paradigm includes terminology and rules for describing learners. Popular learners (local smoothing, k-NN, decision trees, Naive Bayes, and SVM for classification and for regression) are shown here to be implementations of this paradigm.
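To make the smoothness assumption concrete, here is a minimal sketch of k-NN regression on a small one-dimensional sample. It is an illustration only, not the paper's formalization: the function name `knn_predict`, the toy training pairs, and the choice of `k` are all assumptions introduced for this example.

```python
# Illustrative sketch only: k-NN regression as one way to act on a smoothness assumption.
# The function name, the toy data, and k are hypothetical, not taken from the paper.

def knn_predict(train, x, k=3):
    """Predict feedback at x as the mean feedback of the k nearest training cases."""
    # Order training cases by distance from the query point and keep the k closest.
    nearest = sorted(train, key=lambda case: abs(case[0] - x))[:k]
    # Averaging nearby feedback presumes that close data points have similar feedback.
    return sum(y for _, y in nearest) / len(nearest)

# A small training sample of (data point, feedback) pairs.
train = [(0.0, 0.0), (1.0, 1.1), (2.0, 1.9), (3.0, 3.2), (4.0, 3.9)]

# The query 2.5 lies between the cases at 2.0 and 3.0, so with k=2
# the prediction is the mean of their feedback values.
prediction = knn_predict(train, 2.5, k=2)
print(prediction)
```

The prediction is only as reasonable as the assumption behind it: if feedback changed abruptly between 2.0 and 3.0, averaging the two neighbours would give no useful estimate at 2.5.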