Sparse Learning for Variable Selection with Structures and Nonlinearities
This work tackles the challenge of building efficient and interpretable predictive models for data scientists and practitioners, though it appears incremental as it builds on existing sparse learning concepts.
The thesis addresses the problem of overfitting and high computational costs in predictive modeling by developing machine learning methods for automated variable selection to learn sparse models, which rely on a limited set of input variables to improve interpretability and reduce resource usage.
In this thesis we discuss machine learning methods that perform automated variable selection to learn sparse predictive models. There are several reasons to promote sparsity in predictive models. By relying on a limited set of input variables, sparse models naturally counteract the overfitting problem that is ubiquitous when learning from finite sets of training points. Sparse models are also cheaper to use for prediction: they usually require fewer computational resources and, by relying on smaller sets of inputs, may reduce the costs of data collection and storage. Finally, sparse models can contribute to a better understanding of the investigated phenomena, as they are easier to interpret than full models.