Continuous Learning: Engineering Super Features With Feature Algebras
This work addresses the challenge of model search and feature engineering in machine learning, presenting an incremental approach to building complex features from simpler ones.
The paper tackles the problem of searching for predictive models by proposing an iterative procedure that generates a sequence of improving models and corresponding non-linear features. After N iterations the features are polynomials of degree 2^N, and in the limit they form an associative algebra. Model likelihood increases with each iteration while the dimension of the parameter space stays controlled.
In this paper we consider the problem of searching a space of predictive models for a given training data set. We propose an iterative procedure for deriving a sequence of improving models and a corresponding sequence of sets of non-linear features on the original input space. After a finite number of iterations N, the non-linear features become polynomials of degree 2^N on the original space. We show that, in the limit of an infinite number of iterations, the derived non-linear features must form an associative algebra: the product of any two features is equal to a linear combination of features from the same feature space, at every input point. Because each iteration consists of solving a series of convex problems whose feasible sets contain all previous solutions, the likelihood of the models in the sequence increases with each iteration, while the dimension of the model parameter space is held at a limited, controlled value.
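Two claims in the abstract can be illustrated numerically with a minimal sketch. It is not the procedure of the paper: the update rule `feature * feature + feature`, the function names, and the example feature sets are all assumptions chosen for illustration. The first function shows how a feature that is quadratic in the previous feature doubles its polynomial degree at every iteration. The second tests the closure property by regressing each pairwise product of features onto the span of the features and reporting the worst residual.

```python
import numpy as np
from numpy.polynomial import Polynomial


def degrees_after_iterations(n_iter):
    """Track the degree of a feature that is quadratic in its predecessor.

    Hypothetical update rule (an assumption, not the paper's):
    f_{k+1}(x) = f_k(x)^2 + f_k(x), starting from f_0(x) = x.
    """
    feature = Polynomial([0.0, 1.0])  # f_0(x) = x, degree 1
    degrees = []
    for _ in range(n_iter):
        feature = feature * feature + feature  # degree doubles
        degrees.append(feature.degree())
    return degrees


def product_closure_residual(F):
    """Worst residual when products f_i * f_j are fitted by the features.

    F has shape (n_points, n_features); column k holds feature k evaluated
    at every sample point. A value near zero means that, on these points,
    every pairwise product is a linear combination of the features.
    """
    n_features = F.shape[1]
    worst = 0.0
    for i in range(n_features):
        for j in range(i, n_features):
            target = F[:, i] * F[:, j]
            coef, *_ = np.linalg.lstsq(F, target, rcond=None)
            worst = max(worst, float(np.max(np.abs(F @ coef - target))))
    return worst


x = np.linspace(-1.0, 1.0, 50)

# Indicator functions of a partition are closed under multiplication:
# f_i * f_j is f_i when i == j and zero otherwise.
indicators = np.stack(
    [x < -0.3, (x >= -0.3) & (x < 0.4), x >= 0.4], axis=1
).astype(float)

# The span of {1, x} is not closed: x * x is not a combination of 1 and x.
linear = np.stack([np.ones_like(x), x], axis=1)

doubling = degrees_after_iterations(3)
closed_residual = product_closure_residual(indicators)
open_residual = product_closure_residual(linear)
```

Here `doubling` lists the degrees after one, two and three iterations, `closed_residual` is zero up to rounding error, and `open_residual` is clearly non-zero. The check only tests closure on the sampled points, so it is a necessary condition for the algebra property rather than a proof of it.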