Random Bits Regression: a Strong General Predictor for Big Data
This work provides a fast and robust predictor for big data applications, though the contribution appears incremental: it builds on existing regularized regression techniques and adds a novel feature generation approach.
The authors tackle the problem of improving accuracy and speed in regression and classification by proposing Random Bits Regression (RBR), which generates random binary features and applies regularized regression to them. They report that it outperforms other methods in benchmark analyses on simulated, UCI, and GWAS datasets.
To improve the accuracy and speed of regression and classification, we present a data-based prediction method, Random Bits Regression (RBR). The method first generates a large number of random binary intermediate (derived) features from the original input matrix, and then performs regularized linear or logistic regression on those derived features to predict the outcome. Benchmark analyses on a simulated dataset, UCI Machine Learning Repository datasets, and a GWAS dataset showed that RBR outperforms other popular methods in accuracy and robustness. RBR (available at https://sourceforge.net/projects/rbr/) is very fast and requires a reasonable amount of memory; it therefore provides a strong, robust, and fast predictor for the big data era.
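The two-step procedure described above (random binary features, then regularized regression) can be illustrated with a minimal Python sketch. The abstract does not say how the random bits are constructed, so the construction here is an assumption: each bit thresholds a random linear combination of a few randomly chosen input columns, with the threshold taken from a randomly chosen training row. The function names (`make_random_bits`, `apply_bits`, `fit_ridge`, `predict`, `demo`), the parameters `n_bits`, `max_vars`, and `lam`, and the use of closed-form ridge regression are all hypothetical choices for illustration. They are not the authors' implementation or the interface of the released RBR software.

```python
import numpy as np

def make_random_bits(X, n_bits, max_vars=3, rng=None):
    """Draw parameters for n_bits random threshold features (assumed construction)."""
    rng = np.random.default_rng(rng)
    n, p = X.shape
    params = []
    for _ in range(n_bits):
        k = rng.integers(1, min(max_vars, p) + 1)    # how many columns this bit uses
        cols = rng.choice(p, size=k, replace=False)  # which columns
        w = rng.standard_normal(k)                   # random weights
        t = X[rng.integers(n), cols] @ w             # threshold from a random training row
        params.append((cols, w, t))
    return params

def apply_bits(X, params):
    """Map an input matrix to its 0/1 derived-feature matrix."""
    bits = [(X[:, cols] @ w > t) for cols, w, t in params]
    return np.column_stack(bits).astype(float)

def fit_ridge(B, y, lam=1.0):
    """L2-regularized linear regression on the bits; the intercept is not penalized."""
    A = np.hstack([np.ones((B.shape[0], 1)), B])
    penalty = lam * np.eye(A.shape[1])
    penalty[0, 0] = 0.0
    return np.linalg.solve(A.T @ A + penalty, A.T @ y)

def predict(B, beta):
    return beta[0] + B @ beta[1:]

def demo(seed=0):
    """Compare the sketch against a predict-the-mean baseline on synthetic data."""
    rng = np.random.default_rng(seed)
    X = rng.uniform(-1, 1, size=(600, 5))
    y = np.sin(3 * X[:, 0]) + X[:, 1] ** 2 + 0.1 * rng.standard_normal(600)
    X_tr, y_tr, X_te, y_te = X[:400], y[:400], X[400:], y[400:]
    params = make_random_bits(X_tr, n_bits=300, rng=seed)
    beta = fit_ridge(apply_bits(X_tr, params), y_tr, lam=1.0)
    rbr_mse = np.mean((predict(apply_bits(X_te, params), beta) - y_te) ** 2)
    baseline_mse = np.mean((y_tr.mean() - y_te) ** 2)
    return baseline_mse, rbr_mse

if __name__ == "__main__":
    baseline_mse, rbr_mse = demo()
    print(f"baseline test MSE: {baseline_mse:.3f}, random-bits ridge test MSE: {rbr_mse:.3f}")
```

Because the derived features are plain 0/1 columns, a linear model on them can represent nonlinear structure in the original inputs, while the regularization keeps the large number of bits from overfitting. For classification, the same bits could be fed to a regularized logistic regression instead of the ridge solver used here.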