Adaptive Sequential Machine Learning
This work addresses adaptive sample-size selection for sequential machine learning problems. The contribution is incremental: it extends an existing framework with specific applications and validations.
The paper tackles the problem of solving a sequence of stochastic optimization problems arising in machine learning tasks such as regression and classification. It extends an existing framework to adaptively select the sample size at each time step so that the excess risk does not exceed a target level, and it validates the approach with experiments on synthetic and real data.
A framework previously introduced in [3] for solving a sequence of stochastic optimization problems with bounded changes in the minimizers is extended and applied to machine learning problems such as regression and classification. The stochastic optimization problems arising in these machine learning problems are solved using algorithms such as stochastic gradient descent (SGD). A method based on estimates of the change in the minimizers and properties of the optimization algorithm is introduced for adaptively selecting the number of samples at each time step to ensure that the excess risk, i.e., the expected gap between the loss achieved by the approximate minimizer produced by the optimization algorithm and the loss achieved by the exact minimizer, does not exceed a target level. A bound is developed to show that the estimate of the change in the minimizers is non-trivial provided that the excess risk is small enough. Extensions relevant to the machine learning setting are considered, including a cost-based approach for selecting the number of samples under a cost budget over a fixed horizon, and an approach to applying cross-validation for model selection. Finally, experiments with synthetic and real data are used to validate the algorithms.
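
The sample-size selection step described in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: the function names, the constants L, G, and m, and the O(1/n) form of the excess-risk bound are all assumptions made for illustration. The sketch assumes an m-strongly convex loss, so that an excess risk of at most eps at the previous step bounds the distance to the previous minimizer by sqrt(2 * eps / m). Adding the estimated change in the minimizers, rho_hat, gives a bound on the starting distance at the current step. The smallest n whose bound falls below eps is then returned.

```python
import math


def excess_risk_bound(d0, n, L=1.0, G=1.0, m=1.0):
    """Hypothetical O(1/n) bound on the excess risk after n SGD samples.

    d0 bounds the initial distance to the minimizer. L (smoothness),
    G (gradient-norm bound), and m (strong convexity) are assumed
    constants; this form is a stand-in, not the bound from the paper.
    """
    return 0.5 * L * max(d0 ** 2, (G / m) ** 2) / n


def select_sample_size(rho_hat, eps, m=1.0, bound=excess_risk_bound,
                       n_max=10 ** 7):
    """Smallest n with bound(d0, n) <= eps, found by bisection.

    Assumes `bound` is non-increasing in n.
    """
    # Previous excess risk <= eps implies distance <= sqrt(2 * eps / m)
    # under strong convexity; the triangle inequality adds rho_hat.
    d0 = math.sqrt(2.0 * eps / m) + rho_hat
    if bound(d0, n_max) > eps:
        raise ValueError("target excess risk not reachable within n_max")
    lo, hi = 1, n_max
    while lo < hi:
        mid = (lo + hi) // 2
        if bound(d0, mid) <= eps:
            hi = mid          # mid samples suffice; try fewer
        else:
            lo = mid + 1      # mid samples are not enough
    return lo


if __name__ == "__main__":
    # A larger estimated change in the minimizers calls for more samples.
    for rho_hat in (0.5, 2.0):
        print(rho_hat, select_sample_size(rho_hat, eps=0.1))
```

Passing the bound as a parameter keeps the selection rule separate from the optimizer: any bound that is non-increasing in n, for SGD or for another algorithm, could be substituted without changing the search.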