Shuoguang Yang

LG
11papers
138citations
Novelty58%
AI Score28

11 Papers

MLJun 22, 2022
Decentralized Gossip-Based Stochastic Bilevel Optimization over Communication Networks

Shuoguang Yang, Xuezhou Zhang, Mengdi Wang

Bilevel optimization have gained growing interests, with numerous applications found in meta learning, minimax games, reinforcement learning, and nested composition optimization. This paper studies the problem of distributed bilevel optimization over a network where agents can only communicate with neighbors, including examples from multi-task, multi-agent learning and federated learning. In this paper, we propose a gossip-based distributed bilevel learning algorithm that allows networked agents to solve both the inner and outer optimization problems in a single timescale and share information via network propagation. We show that our algorithm enjoys the $\mathcal{O}(\frac{1}{K ε^2})$ per-agent sample complexity for general nonconvex bilevel optimization and $\mathcal{O}(\frac{1}{K ε})$ for strongly convex objective, achieving a speedup that scales linearly with the network size. The sample complexities are optimal in both $ε$ and $K$. We test our algorithm on the examples of hyperparameter tuning and decentralized reinforcement learning. Simulated experiments confirmed that our algorithm achieves the state-of-the-art training efficiency and test accuracy.

MLNov 11, 2022
Online Linearized LASSO

Shuoguang Yang, Yuhao Yan, Xiuneng Zhu et al.

Sparse regression has been a popular approach to perform variable selection and enhance the prediction accuracy and interpretability of the resulting statistical model. Existing approaches focus on offline regularized regression, while the online scenario has rarely been studied. In this paper, we propose a novel online sparse linear regression framework for analyzing streaming data when data points arrive sequentially. Our proposed method is memory efficient and requires less stringent restricted strong convexity assumptions. Theoretically, we show that with a properly chosen regularization parameter, the $\ell_2$-norm statistical error of our estimator diminishes to zero in the optimal order of $\tilde{O}({\sqrt{s/t}})$, where $s$ is the sparsity level, $t$ is the streaming sample size, and $\tilde{O}(\cdot)$ hides logarithmic terms. Numerical experiments demonstrate the practical efficiency of our algorithm.

MLMay 24, 2022
Information-Directed Selection for Top-Two Algorithms

Wei You, Chao Qin, Zihao Wang et al.

We consider the best-k-arm identification problem for multi-armed bandits, where the objective is to select the exact set of k arms with the highest mean rewards by sequentially allocating measurement effort. We characterize the necessary and sufficient conditions for the optimal allocation using dual variables. Remarkably these optimality conditions lead to the extension of top-two algorithm design principle (Russo, 2020), initially proposed for best-arm identification. Furthermore, our optimality conditions induce a simple and effective selection rule dubbed information-directed selection (IDS) that selects one of the top-two candidates based on a measure of information gain. As a theoretical guarantee, we prove that integrated with IDS, top-two Thompson sampling is (asymptotically) optimal for Gaussian best-arm identification, solving a glaring open problem in the pure exploration literature (Russo, 2020). As a by-product, we show that for k > 1, top-two algorithms cannot achieve optimality even when the algorithm has access to the unknown "optimal" tuning parameter. Numerical experiments show the superior performance of the proposed top-two algorithms with IDS and considerable improvement compared with algorithms without adaptive selection.

OCSep 9, 2022
Stochastic Compositional Optimization with Compositional Constraints

Shuoguang Yang, Wei You, Zhe Zhang et al.

Stochastic compositional optimization (SCO) has attracted considerable attention because of its broad applicability to important real-world problems. However, existing works on SCO assume that the projection within a solution update is simple, which fails to hold for problem instances where the constraints are in the form of expectations, such as empirical conditional value-at-risk constraints. We study a novel model that incorporates single-level expected value and two-level compositional constraints into the current SCO framework. Our model can be applied widely to data-driven optimization and risk management, including risk-averse optimization and high-moment portfolio selection, and can handle multiple constraints. We further propose a class of primal-dual algorithms that generates sequences converging to the optimal solution at the rate of $\cO(\frac{1}{\sqrt{N}})$under both single-level expected value and two-level compositional constraints, where $N$ is the iteration counter, establishing the benchmarks in expected value constrained SCO.

LGOct 10, 2023
Federated Multi-Level Optimization over Decentralized Networks

Shuoguang Yang, Xuezhou Zhang, Mengdi Wang

Multi-level optimization has gained increasing attention in recent years, as it provides a powerful framework for solving complex optimization problems that arise in many fields, such as meta-learning, multi-player games, reinforcement learning, and nested composition optimization. In this paper, we study the problem of distributed multi-level optimization over a network, where agents can only communicate with their immediate neighbors. This setting is motivated by the need for distributed optimization in large-scale systems, where centralized optimization may not be practical or feasible. To address this problem, we propose a novel gossip-based distributed multi-level optimization algorithm that enables networked agents to solve optimization problems at different levels in a single timescale and share information through network propagation. Our algorithm achieves optimal sample complexity, scaling linearly with the network size, and demonstrates state-of-the-art performance on various applications, including hyper-parameter tuning, decentralized reinforcement learning, and risk-averse optimization.

LGFeb 16, 2022
Data-Driven Minimax Optimization with Expectation Constraints

Shuoguang Yang, Xudong Li, Guanghui Lan

Attention to data-driven optimization approaches, including the well-known stochastic gradient descent method, has grown significantly over recent decades, but data-driven constraints have rarely been studied, because of the computational challenges of projections onto the feasible set defined by these hard constraints. In this paper, we focus on the non-smooth convex-concave stochastic minimax regime and formulate the data-driven constraints as expectation constraints. The minimax expectation constrained problem subsumes a broad class of real-world applications, including two-player zero-sum game and data-driven robust optimization. We propose a class of efficient primal-dual algorithms to tackle the minimax expectation-constrained problem, and show that our algorithms converge at the optimal rate of $\mathcal{O}(\frac{1}{\sqrt{N}})$. We demonstrate the practical efficiency of our algorithms by conducting numerical experiments on large-scale real-world applications.

LGJan 5, 2022
Bridging Adversarial and Nonstationary Multi-armed Bandit

Ningyuan Chen, Shuoguang Yang, Hailun Zhang

In the multi-armed bandit framework, there are two formulations that are commonly employed to handle time-varying reward distributions: adversarial bandit and nonstationary bandit. Although their oracles, algorithms, and regret analysis differ significantly, we provide a unified formulation in this paper that smoothly bridges the two as special cases. The formulation uses an oracle that takes the best-fixed arm within time windows. Depending on the window size, it turns into the oracle in hindsight in the adversarial bandit and dynamic oracle in the nonstationary bandit. We provide algorithms that attain the optimal regret with the matching lower bound.

SISep 6, 2021
Online Learning of Independent Cascade Models with Node-level Feedback

Shuoguang Yang, Van-Anh Truong

We propose a detailed analysis of the online-learning problem for Independent Cascade (IC) models under node-level feedback. These models have widespread applications in modern social networks. Existing works for IC models have only shed light on edge-level feedback models, where the agent knows the explicit outcome of every observed edge. Little is known about node-level feedback models, where only combined outcomes for sets of edges are observed; in other words, the realization of each edge is censored. This censored information, together with the nonlinear form of the aggregated influence probability, make both parameter estimation and algorithm design challenging. We establish the first confidence-region result under this setting. We also develop an online algorithm achieving a cumulative regret of $\mathcal{O}( \sqrt{T})$, matching the theoretical regret bound for IC models with edge-level feedback.

LGDec 7, 2020
Revenue Maximization and Learning in Products Ranking

Ningyuan Chen, Anran Li, Shuoguang Yang

We consider the revenue maximization problem for an online retailer who plans to display in order a set of products differing in their prices and qualities. Consumers have attention spans, i.e., the maximum number of products they are willing to view, and inspect the products sequentially before purchasing a product or leaving the platform empty-handed when the attention span gets exhausted. Our framework extends the well-known cascade model in two directions: the consumers have random attention spans instead of fixed ones, and the firm maximizes revenues instead of clicking probabilities. We show a nested structure of the optimal product ranking as a function of the attention span when the attention span is fixed. \sg{Using this fact, we develop an approximation algorithm when only the distribution of the attention spans is given. Under mild conditions, it achieves $1/e$ of the revenue of the clairvoyant case when the realized attention span is known. We also show that no algorithms can achieve more than 0.5 of the revenue of the same benchmark. The model and the algorithm can be generalized to the ranking problem when consumers make multiple purchases.} When the conditional purchase probabilities are not known and may depend on consumer and product features, we devise an online learning algorithm that achieves $\tilde{\mathcal{O}}(\sqrt{T})$ regret relative to the approximation algorithm, despite the censoring of information: the attention span of a customer who purchases an item is not observable. Numerical experiments demonstrate the outstanding performance of the approximation and online learning algorithms.

LGApr 24, 2020
Online Learning with Cumulative Oversampling: Application to Budgeted Influence Maximization

Shatian Wang, Shuoguang Yang, Zhen Xu et al.

We propose a cumulative oversampling (CO) method for online learning. Our key idea is to sample parameter estimations from the updated belief space once in each round (similar to Thompson Sampling), and utilize the cumulative samples up to the current round to construct optimistic parameter estimations that asymptotically concentrate around the true parameters as tighter upper confidence bounds compared to the ones constructed with standard UCB methods. We apply CO to a novel budgeted variant of the Influence Maximization (IM) semi-bandits with linear generalization of edge weights, whose offline problem is NP-hard. Combining CO with the oracle we design for the offline problem, our online learning algorithm simultaneously tackles budget allocation, parameter learning, and reward maximization. We show that for IM semi-bandits, our CO-based algorithm achieves a scaled regret comparable to that of the UCB-based algorithms in theory, and performs on par with Thompson Sampling in numerical experiments.

SINov 8, 2019
Online Learning and Optimization Under a New Linear-Threshold Model with Negative Influence

Shuoguang Yang, Shatian Wang, Van-Anh Truong

Problem definition: Corporate brands, grassroots activists, and ordinary citizens all routinely employ Word-of-mouth (WoM) diffusion to promote products and instigate social change. Our work models the formation and spread of negative attitudes via WoM on a social network represented by a directed graph. In an online learning setting, we examine how an agent could simultaneously learn diffusion parameters and choose sets of seed users to initiate diffusions and maximize positive influence. In contrast to edge-level feedback, in which an agent observes the relationship (edge) through which a user (node) is influenced, we more realistically assume node-level feedback, where an agent only observes when a user is influenced and whether that influence is positive or negative. Methodology/results: We propose a new class of negativity-aware Linear Threshold Models. We show that in these models, the expected positive influence spread is a monotone submodular function of the seed set. Therefore, when maximizing positive influence by selecting a seed set of fixed size, a greedy algorithm can guarantee a solution with a constant approximation ratio. For the online learning setting, we propose an algorithm that runs in epochs of growing lengths, each consisting of a fixed number of exploration rounds followed by an increasing number of exploitation rounds controlled by a hyperparameter. Under mild assumptions, we show that our algorithm achieves asymptotic expected average scaled regret that is inversely related to any fractional constant power of the number of rounds. Managerial implications: During seed selection, our negativity-aware models and algorithms allow WoM campaigns to discover and best account for characteristics of local users and propagated content. We also give the first algorithms with regret guarantees for influence maximization under node-level feedback.