Bernd Gärtner

LG
5papers
39citations
Novelty57%
AI Score24

5 Papers

OCSep 23, 2016
Screening Rules for Convex Problems

Anant Raj, Jakob Olbrich, Bernd Gärtner et al.

We propose a new framework for deriving screening rules for convex optimization problems. Our approach covers a large class of constrained and penalized optimization formulations, and works in two steps. First, given any approximate point, the structure of the objective function and the duality gap is used to gather information on the optimal solution. In the second step, this information is used to produce screening rules, i.e. safely identifying unimportant weight variables of the optimal solution. Our general framework leads to a large variety of useful existing as well as new screening rules for many applications. For example, we provide new screening rules for general simplex and $L_1$-constrained problems, Elastic Net, squared-loss Support Vector Machines, minimum enclosing ball, as well as structured norm regularized problems, such as group lasso.

LGMay 2, 2016
Algorithms for Learning Sparse Additive Models with Interactions in High Dimensions

Hemant Tyagi, Anastasios Kyrillidis, Bernd Gärtner et al.

A function $f: \mathbb{R}^d \rightarrow \mathbb{R}$ is a Sparse Additive Model (SPAM), if it is of the form $f(\mathbf{x}) = \sum_{l \in \mathcal{S}}φ_{l}(x_l)$ where $\mathcal{S} \subset [d]$, $|\mathcal{S}| \ll d$. Assuming $φ$'s, $\mathcal{S}$ to be unknown, there exists extensive work for estimating $f$ from its samples. In this work, we consider a generalized version of SPAMs, that also allows for the presence of a sparse number of second order interaction terms. For some $\mathcal{S}_1 \subset [d], \mathcal{S}_2 \subset {[d] \choose 2}$, with $|\mathcal{S}_1| \ll d, |\mathcal{S}_2| \ll d^2$, the function $f$ is now assumed to be of the form: $\sum_{p \in \mathcal{S}_1}φ_{p} (x_p) + \sum_{(l,l^{\prime}) \in \mathcal{S}_2}φ_{(l,l^{\prime})} (x_l,x_{l^{\prime}})$. Assuming we have the freedom to query $f$ anywhere in its domain, we derive efficient algorithms that provably recover $\mathcal{S}_1,\mathcal{S}_2$ with finite sample bounds. Our analysis covers the noiseless setting where exact samples of $f$ are obtained, and also extends to the noisy setting where the queries are corrupted with noise. For the noisy setting in particular, we consider two noise models namely: i.i.d Gaussian noise and arbitrary but bounded noise. Our main methods for identification of $\mathcal{S}_2$ essentially rely on estimation of sparse Hessian matrices, for which we provide two novel compressed sensing based schemes. Once $\mathcal{S}_1, \mathcal{S}_2$ are known, we show how the individual components $φ_p$, $φ_{(l,l^{\prime})}$ can be estimated via additional queries of $f$, with uniform error bounds. Lastly, we provide simulation results on synthetic data that validate our theoretical findings.

LGApr 18, 2016
Learning Sparse Additive Models with Interactions in High Dimensions

Hemant Tyagi, Anastasios Kyrillidis, Bernd Gärtner et al.

A function $f: \mathbb{R}^d \rightarrow \mathbb{R}$ is referred to as a Sparse Additive Model (SPAM), if it is of the form $f(\mathbf{x}) = \sum_{l \in \mathcal{S}}φ_{l}(x_l)$, where $\mathcal{S} \subset [d]$, $|\mathcal{S}| \ll d$. Assuming $φ_l$'s and $\mathcal{S}$ to be unknown, the problem of estimating $f$ from its samples has been studied extensively. In this work, we consider a generalized SPAM, allowing for second order interaction terms. For some $\mathcal{S}_1 \subset [d], \mathcal{S}_2 \subset {[d] \choose 2}$, the function $f$ is assumed to be of the form: $$f(\mathbf{x}) = \sum_{p \in \mathcal{S}_1}φ_{p} (x_p) + \sum_{(l,l^{\prime}) \in \mathcal{S}_2}φ_{(l,l^{\prime})} (x_{l},x_{l^{\prime}}).$$ Assuming $φ_{p},φ_{(l,l^{\prime})}$, $\mathcal{S}_1$ and, $\mathcal{S}_2$ to be unknown, we provide a randomized algorithm that queries $f$ and exactly recovers $\mathcal{S}_1,\mathcal{S}_2$. Consequently, this also enables us to estimate the underlying $φ_p, φ_{(l,l^{\prime})}$. We derive sample complexity bounds for our scheme and also extend our analysis to include the situation where the queries are corrupted with noise -- either stochastic, or arbitrary but bounded. Lastly, we provide simulation results on synthetic data, that validate our theoretical findings.

MLDec 1, 2013
Stochastic continuum armed bandit problem of few linear parameters in high dimensions

Hemant Tyagi, Sebastian Stich, Bernd Gärtner

We consider a stochastic continuum armed bandit problem where the arms are indexed by the $\ell_2$ ball $B_{d}(1+ν)$ of radius $1+ν$ in $\mathbb{R}^d$. The reward functions $r :B_{d}(1+ν) \rightarrow \mathbb{R}$ are considered to intrinsically depend on $k \ll d$ unknown linear parameters so that $r(\mathbf{x}) = g(\mathbf{A} \mathbf{x})$ where $\mathbf{A}$ is a full rank $k \times d$ matrix. Assuming the mean reward function to be smooth we make use of results from low-rank matrix recovery literature and derive an efficient randomized algorithm which achieves a regret bound of $O(C(k,d) n^{\frac{1+k}{2+k}} (\log n)^{\frac{1}{2+k}})$ with high probability. Here $C(k,d)$ is at most polynomial in $d$ and $k$ and $n$ is the number of rounds or the sampling budget which is assumed to be known beforehand.

LGApr 21, 2013
Continuum armed bandit problem of few variables in high dimensions

Hemant Tyagi, Bernd Gärtner

We consider the stochastic and adversarial settings of continuum armed bandits where the arms are indexed by [0,1]^d. The reward functions r:[0,1]^d -> R are assumed to intrinsically depend on at most k coordinate variables implying r(x_1,..,x_d) = g(x_{i_1},..,x_{i_k}) for distinct and unknown i_1,..,i_k from {1,..,d} and some locally Holder continuous g:[0,1]^k -> R with exponent 0 < alpha <= 1. Firstly, assuming (i_1,..,i_k) to be fixed across time, we propose a simple modification of the CAB1 algorithm where we construct the discrete set of sampling points to obtain a bound of O(n^((alpha+k)/(2*alpha+k)) (log n)^((alpha)/(2*alpha+k)) C(k,d)) on the regret, with C(k,d) depending at most polynomially in k and sub-logarithmically in d. The construction is based on creating partitions of {1,..,d} into k disjoint subsets and is probabilistic, hence our result holds with high probability. Secondly we extend our results to also handle the more general case where (i_1,...,i_k) can change over time and derive regret bounds for the same.