Towards Linearization Machine Learning Algorithms
This work addresses prediction accuracy for machine learning practitioners using Spark, though it appears incremental as it builds on existing nearest neighbor and projection concepts.
The paper tackles the problem of improving prediction accuracy in machine learning by proposing a multilinear projection approach that transforms prediction into a consensus problem among nearest neighbors in a transformed space. Results show accuracy improvements over several Spark MLLib algorithms on multiple LIBSVM datasets.
This paper is about a machine learning approach based on the multilinear projection of an unknown function (or probability distribution) to be estimated towards a linear (or multilinear) dimensional space E'. The proposal transforms the problem of predicting the target of an observation x into a problem of determining a consensus among the k nearest neighbors of x's image within the dimensional space E'. The algorithms that concretize it allow both regression and binary classification. Implementations carried out using Scala/Spark and assessed on a dozen LIBSVM datasets have demonstrated improvements in prediction accuracies in comparison with other prediction algorithms implemented within Spark MLLib such as multilayer perceptrons, logistic regression classifiers and random forests.