OC, Aug 21, 2018
Smoothed Hinge Loss and $\ell^{1}$ Support Vector Machines
Jeffrey Hajewski, Suely Oliveira, David E. Stewart
A new algorithm is presented for solving the soft-margin Support Vector Machine (SVM) optimization problem with an $\ell^{1}$ penalty. This algorithm is designed to require a modest number of passes over the data, which is an important measure of its cost for very large data sets. The algorithm uses smoothing for the hinge-loss function, and an active set approach for the $\ell^{1}$ penalty.
LG, Apr 17, 2025
Training Autoencoders Using Stochastic Hessian-Free Optimization with LSMR
Ibrahim Emirahmetoglu, David E. Stewart
Hessian-free (HF) optimization has been shown to effectively train deep autoencoders (Martens, 2010). In this paper, we aim to accelerate HF training of autoencoders by reducing the amount of data used in training. HF uses the conjugate gradient algorithm to estimate update directions. We propose using the LSMR method instead, which is known for effectively solving large sparse linear systems. We also incorporate the improved preconditioner for HF optimization of Chapelle & Erhan (2011). In addition, we introduce a new mini-batch selection algorithm to mitigate overfitting. Our algorithm starts with a small subset of the training data and gradually increases the mini-batch size based on (i) variance estimates obtained during the computation of a mini-batch gradient (Byrd et al., 2012) and (ii) the relative decrease in objective value for the validation data. Our experimental results demonstrate that our stochastic Hessian-free optimization, using the LSMR method and the new sample selection algorithm, leads to rapid training of deep autoencoders with improved generalization error.
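The two mini-batch growth signals named in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the variance check follows the general form of the test in Byrd et al. (2012), and the function names, the thresholds `theta` and `tol`, and the way the two signals would be combined are all assumptions made for illustration.

```python
def variance_test_fails(per_example_grads, theta):
    """Return True when the gradient variance estimate is large relative to
    the mini-batch gradient, suggesting a larger batch (assumed rule)."""
    n = len(per_example_grads)
    dim = len(per_example_grads[0])
    # Mini-batch gradient: the mean of the per-example gradients.
    mean = [sum(g[j] for g in per_example_grads) / n for j in range(dim)]
    # Unbiased sample variance of each gradient component.
    var = [
        sum((g[j] - mean[j]) ** 2 for g in per_example_grads) / (n - 1)
        for j in range(dim)
    ]
    lhs = sum(var) / n                           # ||Var||_1 / |S|
    rhs = theta ** 2 * sum(m * m for m in mean)  # theta^2 * ||mean grad||^2
    return lhs > rhs


def validation_stalled(prev_val, curr_val, tol):
    """Return True when the relative decrease in validation objective is
    below tol (hypothetical threshold). Assumes prev_val is nonzero."""
    return (prev_val - curr_val) / abs(prev_val) < tol


consistent = [[1.0, 0.0], [1.0, 0.0], [1.0, 0.0]]
noisy = [[1.0, 0.0], [-1.0, 0.0], [1.0, 0.0], [-1.0, 0.0]]
grow_on_consistent = variance_test_fails(consistent, theta=0.5)
grow_on_noisy = variance_test_fails(noisy, theta=0.5)
```

The abstract does not say how the two criteria interact, so the sketch keeps them as separate predicates rather than guessing at a combined rule.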
OC, Dec 17, 2023
A Smoothing Algorithm for l1 Support Vector Machines
Ibrahim Emirahmetoglu, Jeffrey Hajewski, Suely Oliveira et al.
A smoothing algorithm is presented for solving the soft-margin Support Vector Machine (SVM) optimization problem with an $\ell^{1}$ penalty. This algorithm is designed to require a modest number of passes over the data, which is an important measure of its cost for very large datasets. The algorithm uses smoothing for the hinge-loss function, and an active set approach for the $\ell^{1}$ penalty. The smoothing parameter $\alpha$ is initially large, but is typically halved when the smoothed problem is solved to sufficient accuracy. Convergence theory is presented that shows $\mathcal{O}(1+\log(1+\log_+(1/\alpha)))$ guarded Newton steps for each value of $\alpha$ except for asymptotic bands $\alpha=\Theta(1)$ and $\alpha=\Theta(1/N)$, with only one Newton step provided $\eta\alpha\gg1/N$, where $N$ is the number of data points and the stopping criterion is that the predicted reduction is less than $\eta\alpha$. The experimental results show that our algorithm is capable of strong test accuracy without sacrificing training speed.
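To make the role of the smoothing parameter concrete, here is a minimal sketch of one common way to smooth the hinge loss. The abstract does not give the paper's smoothing function, so the Huber-style quadratic blend below is an assumption, as are the function names. Here `u` stands for the margin violation $1 - y(w^{T}x + b)$, and the loop only mimics the halving schedule for $\alpha$ without solving any optimization problem.

```python
def hinge(u):
    """Standard hinge loss as a function of the margin violation u."""
    return max(0.0, u)


def smoothed_hinge(u, alpha):
    """Huber-style smoothing (assumed form): quadratic on (0, alpha),
    linear beyond alpha, and continuously differentiable everywhere."""
    if u <= 0.0:
        return 0.0
    if u < alpha:
        return u * u / (2.0 * alpha)
    return u - alpha / 2.0


def max_gap(alpha, grid):
    """Largest difference between hinge and its smoothing on a grid."""
    return max(hinge(u) - smoothed_hinge(u, alpha) for u in grid)


grid = [i / 100.0 for i in range(-300, 301)]
alpha = 1.0
gaps = []
for _ in range(4):
    gaps.append(max_gap(alpha, grid))
    alpha /= 2.0  # halve the smoothing parameter, as the abstract describes
```

For this particular smoothing, the gap to the true hinge loss is at most $\alpha/2$, so each halving of $\alpha$ halves the worst-case approximation error.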
NE, Apr 23, 2020
gBeam-ACO: a greedy and faster variant of Beam-ACO
Jeff Hajewski, Suely Oliveira, David E. Stewart et al.
Beam-ACO, a modification of the traditional Ant Colony Optimization (ACO) algorithms that incorporates a modified beam search, is one of the most effective ACO algorithms for solving the Traveling Salesman Problem (TSP). Although adding beam search to the ACO heuristic search process is effective, it also increases the amount of work (in terms of partial paths) done by the algorithm at each step. In this work, we introduce a variant of Beam-ACO that uses a greedy path selection heuristic. The exploitation of the greedy path selection is offset by the exploration required in maintaining the beam of paths. This approach has the added benefit of avoiding costly calls to a random number generator and reduces the algorithm's internal state, making it simpler to parallelize. Our experiments demonstrate that our greedy Beam-ACO (gBeam-ACO) is not only faster than traditional Beam-ACO, in some cases by an order of magnitude, but also does not sacrifice the quality of the solution found, especially on large TSP instances. We also found that gBeam-ACO was less dependent on hyperparameter settings.
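The idea of replacing random sampling with greedy selection inside a beam can be sketched as follows. This is an illustrative toy, not the gBeam-ACO algorithm itself: the scoring rule (pheromone times inverse distance raised to `beta`), the function names, and the omission of pheromone updates are all assumptions.

```python
def greedy_beam_step(beam, dist, tau, beam_width, beta=2.0):
    """Extend every partial path by every unvisited city, then keep the
    beam_width highest-scoring extensions. No random numbers are drawn."""
    n = len(dist)
    candidates = []
    for path, length in beam:
        last = path[-1]
        visited = set(path)
        for city in range(n):
            if city in visited:
                continue
            # Assumed ACO-style score: pheromone * (1 / distance) ** beta.
            score = tau[last][city] * (1.0 / dist[last][city]) ** beta
            candidates.append((score, path + [city], length + dist[last][city]))
    candidates.sort(key=lambda c: c[0], reverse=True)
    return [(p, l) for _, p, l in candidates[:beam_width]]


def greedy_beam_tour(dist, tau, beam_width, start=0):
    """Build complete tours with repeated greedy beam steps and return the
    shortest closed tour found as (path, length)."""
    beam = [([start], 0.0)]
    for _ in range(len(dist) - 1):
        beam = greedy_beam_step(beam, dist, tau, beam_width)
    tours = [(p, l + dist[p[-1]][start]) for p, l in beam]
    return min(tours, key=lambda t: t[1])


# Four cities on a cycle: neighbours are 1.0 apart, opposite cities 1.5.
dist = [
    [0.0, 1.0, 1.5, 1.0],
    [1.0, 0.0, 1.0, 1.5],
    [1.5, 1.0, 0.0, 1.0],
    [1.0, 1.5, 1.0, 0.0],
]
tau = [[1.0] * 4 for _ in range(4)]  # uniform pheromone (assumption)
best_path, best_length = greedy_beam_tour(dist, tau, beam_width=2)
```

Because selection is a deterministic sort rather than a draw from a probability distribution, the step keeps no random-number-generator state, which is in the spirit of the parallelization benefit the abstract mentions.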