Adarsh Barik

LG
h-index: 8
18 papers
75 citations
Novelty: 55%
AI Score: 46

18 Papers

OC Jul 10, 2023
Invex Programs: First Order Algorithms and Their Convergence

Adarsh Barik, Suvrit Sra, Jean Honorio

Invex programs are a special kind of non-convex problems which attain global minima at every stationary point. While classical first-order gradient descent methods can solve them, they converge very slowly. In this paper, we propose new first-order algorithms to solve the general class of invex problems. We identify sufficient conditions for convergence of our algorithms and provide rates of convergence. Furthermore, we go beyond unconstrained problems and provide a novel projected gradient method for constrained invex programs with convergence rate guarantees. We compare and contrast our results with existing first-order algorithms for a variety of unconstrained and constrained invex problems. To the best of our knowledge, our proposed algorithm is the first algorithm to solve constrained invex programs.
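
As a rough illustration of the definition above (and not one of the algorithms proposed in the paper), the sketch below runs plain gradient descent on f(x) = 1 - exp(-x^2). This one-dimensional function is an assumption chosen for illustration: it is non-convex, yet its only stationary point, x = 0, is also its global minimum, so it is invex. The function names, step size, and iteration count are hypothetical.

```python
import math

def f(x):
    # Illustrative invex function (an assumption, not taken from the paper).
    return 1.0 - math.exp(-x * x)

def grad_f(x):
    # f'(x) = 2x * exp(-x^2); it vanishes only at x = 0, the global minimizer.
    return 2.0 * x * math.exp(-x * x)

def gradient_descent(x0, step=0.5, iters=50):
    # Plain fixed-step gradient descent, used here only as a baseline.
    x = x0
    for _ in range(iters):
        x -= step * grad_f(x)
    return x

x_final = gradient_descent(1.0)
```

Started far from the origin (say x0 = 5), the gradient is on the order of 1e-10, so the same loop barely moves. That flat tail hints at why classical gradient descent can be slow on landscapes of this kind.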

LG Jun 2, 2022
Sparse Mixed Linear Regression with Guarantees: Taming an Intractable Problem with Invex Relaxation

Adarsh Barik, Jean Honorio

In this paper, we study the problem of sparse mixed linear regression on an unlabeled dataset that is generated from linear measurements from two different regression parameter vectors. Since the data is unlabeled, our task is not only to figure out a good approximation of the regression parameter vectors but also to label the dataset correctly. In its original form, this problem is NP-hard. The most popular algorithms to solve this problem (such as Expectation-Maximization) have a tendency to get stuck at local minima. We provide a novel invex relaxation for this intractable problem which leads to a solution with provable theoretical guarantees. This relaxation enables exact recovery of data labels. Furthermore, we recover a close approximation of the regression parameter vectors which matches the true parameter vectors in support and sign. Our formulation uses a carefully constructed primal-dual witness framework for the invex problem. Furthermore, we show that the sample complexity of our method is only logarithmic in terms of the dimension of the regression parameter vectors.

LG Jun 22, 2023
Outlier-robust Estimation of a Sparse Linear Model Using Invexity

Adarsh Barik, Jean Honorio

In this paper, we study the problem of estimating a sparse regression vector with correct support in the presence of outlier samples. The inconsistency of lasso-type methods is well known in this scenario. We propose a combinatorial version of outlier-robust lasso which also identifies clean samples. Subsequently, we use these clean samples to make a good estimation. We also provide a novel invex relaxation for the combinatorial problem and provide provable theoretical guarantees for this relaxation. Finally, we conduct experiments to validate our theory and compare our results against standard lasso.

LG May 21
Bandit Convex Optimization with Gradient Prediction Adaptivity

Shuche Wang, Adarsh Barik, Vincent Y. F. Tan

Bandit convex optimization (BCO) is a fundamental online learning framework with partial feedback, where the learner observes only the loss incurred at the chosen decision point in each round. In this work, we investigate whether optimistic gradient predictions can improve worst-case regret guarantees in a prediction-adaptive manner. Specifically, given gradient predictions $m_t$, we seek regret bounds that scale with the cumulative prediction error $S_T=\sum_{t=1}^T \|\nabla f_t(x_t)-m_t\|^2.$ We first establish a negative result: under the single-point feedback protocol, an unavoidable $Ω(\sqrt{T})$ regret lower bound persists even when $S_T=o(T)$, showing that the variance of gradient estimation fundamentally obscures the benefit of accurate predictions. To overcome this barrier, we propose \emph{Two-Point Variance-Reduced Optimistic Gradient Descent} (TP-VR-OPT) for the two-point feedback setting. The key idea is a novel variance-reduced gradient estimator whose variance scales with the prediction error rather than the gradient norm. This yields a regret bound of $O\big(\sqrt{d\,\mathbb{E}[S_T]}\big),$ where $d$ is the decision dimension. Complementing this result, we establish an information-theoretic lower bound that scales as $Ω(\sqrt{\mathbb{E}[S_T]})$, providing a fundamental characterization of the best achievable prediction-adaptive regret and showing that TP-VR-OPT is optimal up to a factor of $\sqrt d$. We further develop adaptive variants that eliminate the need for prior knowledge of $\mathbb{E}[S_T]$ or the horizon $T$, and extend our framework to non-stationary environments, establishing dynamic regret guarantees that adapt simultaneously to the cumulative prediction error and the comparator path length.
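
To make the idea of a prediction-centered two-point estimator concrete, here is a minimal sketch. It is an assumption for illustration and is not claimed to be the TP-VR-OPT estimator from the paper: it queries the loss at x + delta*u and x - delta*u for a random unit direction u, subtracts the directional slope that the prediction m already explains, and adds m back. For a linear loss with gradient g, the estimate equals m + d*<g - m, u>*u, whose mean over u is g and whose spread depends on g - m rather than on g. All function names are hypothetical.

```python
import math
import random

def unit_vector(d, rng):
    # Uniform direction on the unit sphere via a normalized Gaussian vector.
    v = [rng.gauss(0.0, 1.0) for _ in range(d)]
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]

def two_point_estimate(f, x, m, delta, rng):
    """Hypothetical two-point gradient estimate centered at the prediction m."""
    d = len(x)
    u = unit_vector(d, rng)
    x_plus = [xi + delta * ui for xi, ui in zip(x, u)]
    x_minus = [xi - delta * ui for xi, ui in zip(x, u)]
    # Finite-difference slope along u from the two loss evaluations.
    slope = (f(x_plus) - f(x_minus)) / (2.0 * delta)
    # Remove the part of the slope that the prediction already accounts for.
    residual = slope - sum(mi * ui for mi, ui in zip(m, u))
    return [mi + d * residual * ui for mi, ui in zip(m, u)]

def prediction_error(grads, preds):
    # S_T = sum over rounds of ||grad_t - m_t||^2, as defined in the abstract.
    return sum(
        sum((gi - mi) ** 2 for gi, mi in zip(g, m))
        for g, m in zip(grads, preds)
    )

# Usage: with a perfect prediction on a linear loss, the estimate has no noise.
g_true = [1.0, -2.0, 3.0]
linear_loss = lambda x: sum(gi * xi for gi, xi in zip(g_true, x))
estimate = two_point_estimate(linear_loss, [0.0, 0.0, 0.0], g_true, 0.1, random.Random(0))
```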

LG Aug 19, 2022
A Novel Plug-and-Play Approach for Adversarially Robust Generalization

Deepak Maurya, Adarsh Barik, Jean Honorio

In this work, we propose a robust framework that employs adversarially robust training to safeguard the ML models against perturbed testing data. Our contributions can be seen from both computational and statistical perspectives. Firstly, from a computational/optimization point of view, we derive the ready-to-use exact solution for several widely used loss functions with a variety of norm constraints on adversarial perturbation for various supervised and unsupervised ML problems, including regression, classification, two-layer neural networks, graphical models, and matrix completion. The solutions are either in closed-form, or an easily tractable optimization problem such as 1-D convex optimization, semidefinite programming, difference of convex programming or a sorting-based algorithm. Secondly, from a statistical/generalization viewpoint, using some of these results, we derive novel bounds of the adversarial Rademacher complexity for various problems, which entail new generalization bounds. Thirdly, we perform some sanity-check experiments on real-world datasets for supervised problems such as regression and classification, as well as for unsupervised problems such as matrix completion and learning graphical models, with very little computational overhead.

LG Sep 7, 2024
A Sample Efficient Alternating Minimization-based Algorithm For Robust Phase Retrieval

Adarsh Barik, Anand Krishna, Vincent Y. F. Tan

In this work, we study the robust phase retrieval problem where the task is to recover an unknown signal $θ^* \in \mathbb{R}^d$ in the presence of potentially arbitrarily corrupted magnitude-only linear measurements. We propose an alternating minimization approach that incorporates an oracle solver for a non-convex optimization problem as a subroutine. Our algorithm guarantees convergence to $θ^*$ and provides an explicit polynomial dependence of the convergence rate on the fraction of corrupted measurements. We then provide an efficient construction of the aforementioned oracle under a sparse arbitrary outliers model and offer valuable insights into the geometric properties of the loss landscape in phase retrieval with corrupted measurements. Our proposed oracle avoids the need for computationally intensive spectral initialization, using a simple gradient descent algorithm with a constant step size and random initialization instead. Additionally, our overall algorithm achieves nearly linear sample complexity, $\mathcal{O}(d \, \mathrm{polylog}(d))$.

LG Aug 12, 2024
LEARN: An Invex Loss for Outlier Oblivious Robust Online Optimization

Adarsh Barik, Anand Krishna, Vincent Y. F. Tan

We study a robust online convex optimization framework, where an adversary can introduce outliers by corrupting loss functions in an arbitrary number of rounds k, unknown to the learner. Our focus is on a novel setting allowing unbounded domains and large gradients for the losses without relying on a Lipschitz assumption. We introduce the Log Exponential Adjusted Robust and iNvex (LEARN) loss, a non-convex (invex) robust loss function to mitigate the effects of outliers and develop a robust variant of the online gradient descent algorithm by leveraging the LEARN loss. We establish tight regret guarantees (up to constants), in a dynamic setting, with respect to the uncorrupted rounds and conduct experiments to validate our theory. Furthermore, we present a unified analysis framework for developing online optimization algorithms for non-convex (invex) losses, utilizing it to provide regret bounds with respect to the LEARN loss, which may be of independent interest.

LG Dec 14, 2024
p-Mean Regret for Stochastic Bandits

Anand Krishna, Philips George John, Adarsh Barik et al.

In this work, we extend the concept of the $p$-mean welfare objective from social choice theory (Moulin 2004) to study $p$-mean regret in stochastic multi-armed bandit problems. The $p$-mean regret, defined as the difference between the optimal mean among the arms and the $p$-mean of the expected rewards, offers a flexible framework for evaluating bandit algorithms, enabling algorithm designers to balance fairness and efficiency by adjusting the parameter $p$. Our framework encompasses both average cumulative regret and Nash regret as special cases. We introduce a simple, unified UCB-based algorithm (Explore-Then-UCB) that achieves novel $p$-mean regret bounds. Our algorithm consists of two phases: a carefully calibrated uniform exploration phase to initialize sample means, followed by the UCB1 algorithm of Auer, Cesa-Bianchi, and Fischer (2002). Under mild assumptions, we prove that our algorithm achieves a $p$-mean regret bound of $\tilde{O}\left(\sqrt{\frac{k}{T^{\frac{1}{2|p|}}}}\right)$ for all $p \leq -1$, where $k$ represents the number of arms and $T$ the time horizon. When $-1<p<0$, we achieve a regret bound of $\tilde{O}\left(\sqrt{\frac{k^{1.5}}{T^{\frac{1}{2}}}}\right)$. For the range $0< p \leq 1$, we achieve a $p$-mean regret scaling as $\tilde{O}\left(\sqrt{\frac{k}{T}}\right)$, which matches the previously established lower bound up to logarithmic factors (Auer et al. 1995). This result stems from the fact that the $p$-mean regret of any algorithm is at least its average cumulative regret for $p \leq 1$. In the case of Nash regret (the limit as $p$ approaches zero), our unified approach differs from prior work (Barman et al. 2023), which requires a new Nash Confidence Bound algorithm. Notably, we achieve the same regret bound up to constant factors using our more general method.
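
The $p$-mean used above is the standard generalized (power) mean. The sketch below computes it and the resulting regret for a list of per-round expected rewards. The function names are hypothetical, and the input convention (one expected reward per round, all strictly positive) is an assumption made for illustration.

```python
import math

def p_mean(values, p):
    """Generalized p-mean of strictly positive values.

    p = 1 gives the arithmetic mean, p = -1 the harmonic mean,
    and p = 0 is taken as the limit, i.e. the geometric mean.
    """
    n = len(values)
    if p == 0:
        return math.exp(sum(math.log(v) for v in values) / n)
    return (sum(v ** p for v in values) / n) ** (1.0 / p)

def p_mean_regret(mu_star, expected_rewards, p):
    # Optimal arm mean minus the p-mean of the per-round expected rewards.
    return mu_star - p_mean(expected_rewards, p)

# Usage: the same pull sequence is penalized more as p decreases.
rewards = [0.9, 0.5, 0.9]
regret_avg = p_mean_regret(0.9, rewards, 1)    # average regret
regret_nash = p_mean_regret(0.9, rewards, 0)   # Nash regret (geometric mean)
regret_harm = p_mean_regret(0.9, rewards, -1)  # harmonic-mean regret
```

Because the $p$-mean is non-decreasing in $p$, the regret values here are ordered regret_avg <= regret_nash <= regret_harm, consistent with the abstract's remark that $p$-mean regret is at least the average regret for $p \leq 1$.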

LG Oct 6, 2025
Parameter-free Algorithms for the Stochastically Extended Adversarial Model

Shuche Wang, Adarsh Barik, Peng Zhao et al.

We develop the first parameter-free algorithms for the Stochastically Extended Adversarial (SEA) model, a framework that bridges adversarial and stochastic online convex optimization. Existing approaches for the SEA model require prior knowledge of problem-specific parameters, such as the diameter of the domain $D$ and the Lipschitz constant of the loss functions $G$, which limits their practical applicability. Addressing this, we develop parameter-free methods by leveraging the Optimistic Online Newton Step (OONS) algorithm to eliminate the need for these parameters. We first establish a comparator-adaptive algorithm for the scenario with unknown domain diameter but known Lipschitz constant, achieving an expected regret bound of $\tilde{O}\big(\|u\|_2^2 + \|u\|_2(\sqrt{σ^2_{1:T}} + \sqrt{Σ^2_{1:T}})\big)$, where $u$ is the comparator vector and $σ^2_{1:T}$ and $Σ^2_{1:T}$ represent the cumulative stochastic variance and cumulative adversarial variation, respectively. We then extend this to the more general setting where both $D$ and $G$ are unknown, attaining the comparator- and Lipschitz-adaptive algorithm. Notably, the regret bound exhibits the same dependence on $σ^2_{1:T}$ and $Σ^2_{1:T}$, demonstrating the efficacy of our proposed methods even when both parameters are unknown in the SEA model.

NA Feb 19, 2021
Information-Theoretic Bounds for Integral Estimation

Donald Q. Adams, Adarsh Barik, Jean Honorio

In this paper, we consider a zero-order stochastic oracle model of estimating definite integrals. In this model, integral estimation methods may query an oracle function for a fixed number of noisy values of the integrand function and use these values to produce an estimate of the integral. We first show that the information-theoretic error lower bound for estimating the integral of a $d$-dimensional function over a region with $l_\infty$ radius $r$ using at most $T$ queries to the oracle function is $Ω(2^d r^{d+1}\sqrt{d/T})$. Additionally, we find that the Gaussian Quadrature method under the same model achieves a rate of $O(2^{d}r^d/\sqrt{T})$ for functions with zero fourth and higher-order derivatives with respect to individual dimensions, and for Gaussian oracles, this rate is tight. For functions with nonzero fourth derivatives, the Gaussian Quadrature method achieves an upper bound which is not tight with the information-theoretic lower bound. Therefore, it is not minimax optimal, so there is space for the development of better integral estimation methods for such functions.
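
A one-dimensional sketch of the oracle model may help fix ideas. It is an illustration under stated assumptions, not the paper's $d$-dimensional construction: the integrand is queried through a noisy oracle, and a two-point Gauss-Legendre rule (nodes at ±r/√3, each with weight r) estimates the integral over [-r, r]. This rule is exact for polynomials up to degree three, that is, for functions whose fourth and higher derivatives vanish. The names `noisy_oracle` and `gauss_quadrature_2pt` are hypothetical.

```python
import math
import random

def noisy_oracle(f, sigma, rng):
    # Wrap f so that each query returns f(x) plus zero-mean Gaussian noise.
    def query(x):
        return f(x) + rng.gauss(0.0, sigma)
    return query

def gauss_quadrature_2pt(query, r, queries_per_node=1):
    """Two-point Gauss-Legendre estimate of the integral over [-r, r]."""
    node = r / math.sqrt(3.0)
    total = 0.0
    for x in (-node, node):
        # Average repeated noisy queries at the node; each node has weight r.
        avg = sum(query(x) for _ in range(queries_per_node)) / queries_per_node
        total += r * avg
    return total

# Usage: with a noiseless oracle the rule is exact for a cubic integrand.
cubic = lambda x: x ** 3 + x ** 2 + 1.0
exact_value = 28.0 / 3.0  # integral of the cubic over [-2, 2]
noiseless = gauss_quadrature_2pt(noisy_oracle(cubic, 0.0, random.Random(0)), 2.0)
noisy = gauss_quadrature_2pt(noisy_oracle(cubic, 0.1, random.Random(0)), 2.0, 500)
```

With noise, the total query budget is 2 * queries_per_node, and averaging shrinks the standard deviation of the estimate in proportion to one over the square root of that budget.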

LG Feb 19, 2021
Fair Sparse Regression with Clustering: An Invex Relaxation for a Combinatorial Problem

Adarsh Barik, Jean Honorio

In this paper, we study the problem of fair sparse regression on a biased dataset where bias depends upon a hidden binary attribute. The presence of a hidden attribute adds an extra layer of complexity to the problem by combining sparse regression and clustering with unknown binary labels. The corresponding optimization problem is combinatorial, but we propose a novel relaxation of it as an \emph{invex} optimization problem. To the best of our knowledge, this is the first invex relaxation for a combinatorial problem. We show that the inclusion of the debiasing/fairness constraint in our model has no adverse effect on the performance. Rather, it enables the recovery of the hidden attribute. The support of our recovered regression parameter vector matches exactly with that of the true parameter vector. Moreover, we simultaneously solve the clustering problem by recovering the exact value of the hidden attribute for each sample. Our method uses carefully constructed primal-dual witnesses to provide theoretical guarantees for the combinatorial problem. To that end, we show that the sample complexity of our method is logarithmic in terms of the dimension of the regression parameter vector.

LG Feb 18, 2021
A Simple Unified Framework for High Dimensional Bandit Problems

Wenjie Li, Adarsh Barik, Jean Honorio

Stochastic high dimensional bandit problems with low dimensional structures are useful in different applications such as online advertising and drug discovery. In this work, we propose a simple unified algorithm for such problems and present a general analysis framework for the regret upper bound of our algorithm. We show that under some mild unified assumptions, our algorithm can be applied to different high dimensional bandit problems. Our framework utilizes the low dimensional structure to guide the parameter estimation in the problem, so our algorithm achieves comparable regret bounds in the LASSO bandit, as well as novel bounds in the low-rank matrix bandit, the group sparse matrix bandit, and in a new problem: the multi-agent LASSO bandit.

LG Jun 22, 2020
Exact Support Recovery in Federated Regression with One-shot Communication

Adarsh Barik, Jean Honorio

Federated learning provides a framework to address the challenges of distributed computing, data ownership and privacy over a large number of distributed clients with low computational and communication capabilities. In this paper, we study the problem of learning the exact support of sparse linear regression in the federated learning setup. We provide a simple communication efficient algorithm which only needs one-shot communication with the centralized server to compute the exact support. Our method does not require the clients to solve any optimization problem and thus, can be run on devices with low computational capabilities. Our method is naturally robust to the problems of client failure, model poisoning and straggling clients. We formally prove that our method requires a number of samples per client that is polynomial with respect to the support size, but independent of the dimension of the problem. We require the number of distributed clients to be logarithmic in the dimension of the problem. If the predictor variables are mutually independent then the overall sample complexity matches the optimal sample complexity of the non-federated centralized setting. Furthermore, our method is easy to implement and has an overall polynomial time complexity.

GT Apr 1, 2020
Provable Sample Complexity Guarantees for Learning of Continuous-Action Graphical Games with Nonparametric Utilities

Adarsh Barik, Jean Honorio

In this paper, we study the problem of learning the exact structure of continuous-action games with non-parametric utility functions. We propose an $\ell_1$ regularized method which encourages sparsity of the coefficients of the Fourier transform of the recovered utilities. Our method works by accessing very few Nash equilibria and their noisy utilities. Under certain technical conditions, our method also recovers the exact structure of these utility functions, and thus, the exact structure of the game. Furthermore, our method only needs a logarithmic number of samples in terms of the number of players and runs in polynomial time. We follow the primal-dual witness framework to provide provable theoretical guarantees.

GT Nov 8, 2019
Provable Computational and Statistical Guarantees for Efficient Learning of Continuous-Action Graphical Games

Adarsh Barik, Jean Honorio

In this paper, we study the problem of learning the set of pure strategy Nash equilibria and the exact structure of a continuous-action graphical game with quadratic payoffs by observing a small set of perturbed equilibria. A continuous-action graphical game can possibly have an uncountable set of Nash equilibria. We propose an $\ell_{12}$-block regularized method which recovers a graphical game, whose Nash equilibria are the $ε$-Nash equilibria of the game from which the data was generated (true game). Under a slightly stringent condition on the parameters of the true game, our method recovers the exact structure of the graphical game. Our method has a logarithmic sample complexity with respect to the number of players. It also runs in polynomial time.

LG May 29, 2019
Learning Bayesian Networks with Low Rank Conditional Probability Tables

Adarsh Barik, Jean Honorio

In this paper, we provide a method to learn the directed structure of a Bayesian network using data. The data is accessed by making conditional probability queries to a black-box model. We introduce a notion of simplicity of representation of conditional probability tables for the nodes in the Bayesian network, that we call "low rankness". We connect this notion to the Fourier transformation of real valued set functions and propose a method which learns the exact directed structure of a "low rank" Bayesian network using very few queries. We formally prove that our method correctly recovers the true directed structure, runs in polynomial time and only needs polynomial samples with respect to the number of nodes. We also provide further improvements in efficiency if we have access to some observational data.

LG Mar 12, 2018
Learning discrete Bayesian networks in polynomial time and sample complexity

Adarsh Barik, Jean Honorio

In this paper, we study the problem of structure learning for Bayesian networks in which nodes take discrete values. The problem is NP-hard in general but we show that under certain conditions we can recover the true structure of a Bayesian network with a sufficient number of samples. We develop a mathematical model which does not assume any specific conditional probability distributions for the nodes. We use a primal-dual witness construction to prove that, under some technical conditions on the interaction between node pairs, we can do exact recovery of the parents and children of a node by performing group $\ell_{12}$-regularized multivariate regression. Thus, we recover the true Bayesian network structure. If the degree of a node is bounded, then the sample complexity of our proposed approach grows logarithmically with respect to the number of nodes in the Bayesian network. Furthermore, our method runs in polynomial time.

LG Jan 26, 2017
Information Theoretic Limits for Linear Prediction with Graph-Structured Sparsity

Adarsh Barik, Jean Honorio, Mohit Tawarmalani

We analyze the necessary number of samples for sparse vector recovery in a noisy linear prediction setup. This model includes problems such as linear regression and classification. We focus on structured graph models. In particular, we prove that the sufficient number of samples for the weighted graph model proposed by Hegde and others is also necessary. We use Fano's inequality on well-constructed ensembles as our main tool in establishing information-theoretic lower bounds.