DSFeb 21, 2022
Learning Low Degree HypergraphsEric Balkanski, Oussama Hanguir, Shatian Wang
We study the problem of learning a hypergraph via edge detecting queries. In this problem, a learner queries subsets of vertices of a hidden hypergraph and observes whether these subsets contain an edge or not. In general, learning a hypergraph with $m$ edges of maximum size $d$ requires $Ω((2m/d)^{d/2})$ queries. In this paper, we aim to identify families of hypergraphs that can be learned without suffering from a query complexity that grows exponentially in the size of the edges. We show that hypermatchings and low-degree near-uniform hypergraphs with $n$ vertices are learnable with poly$(n)$ queries. For learning hypermatchings (hypergraphs of maximum degree $ 1$), we give an $O(\log^3 n)$-round algorithm with $O(n \log^5 n)$ queries. We complement this upper bound by showing that there are no algorithms with poly$(n)$ queries that learn hypermatchings in $o(\log \log n)$ adaptive rounds. For hypergraphs with maximum degree $Δ$ and edge size ratio $ρ$, we give a non-adaptive algorithm with $O((2n)^{ρΔ+1}\log^2 n)$ queries. To the best of our knowledge, these are the first algorithms with poly$(n, m)$ query complexity for learning non-trivial families of hypergraphs that have a super-constant number of edges of super-constant size.
LGApr 24, 2020
Online Learning with Cumulative Oversampling: Application to Budgeted Influence MaximizationShatian Wang, Shuoguang Yang, Zhen Xu et al.
We propose a cumulative oversampling (CO) method for online learning. Our key idea is to sample parameter estimations from the updated belief space once in each round (similar to Thompson Sampling), and utilize the cumulative samples up to the current round to construct optimistic parameter estimations that asymptotically concentrate around the true parameters as tighter upper confidence bounds compared to the ones constructed with standard UCB methods. We apply CO to a novel budgeted variant of the Influence Maximization (IM) semi-bandits with linear generalization of edge weights, whose offline problem is NP-hard. Combining CO with the oracle we design for the offline problem, our online learning algorithm simultaneously tackles budget allocation, parameter learning, and reward maximization. We show that for IM semi-bandits, our CO-based algorithm achieves a scaled regret comparable to that of the UCB-based algorithms in theory, and performs on par with Thompson Sampling in numerical experiments.
SINov 8, 2019
Online Learning and Optimization Under a New Linear-Threshold Model with Negative InfluenceShuoguang Yang, Shatian Wang, Van-Anh Truong
Problem definition: Corporate brands, grassroots activists, and ordinary citizens all routinely employ Word-of-mouth (WoM) diffusion to promote products and instigate social change. Our work models the formation and spread of negative attitudes via WoM on a social network represented by a directed graph. In an online learning setting, we examine how an agent could simultaneously learn diffusion parameters and choose sets of seed users to initiate diffusions and maximize positive influence. In contrast to edge-level feedback, in which an agent observes the relationship (edge) through which a user (node) is influenced, we more realistically assume node-level feedback, where an agent only observes when a user is influenced and whether that influence is positive or negative. Methodology/results: We propose a new class of negativity-aware Linear Threshold Models. We show that in these models, the expected positive influence spread is a monotone submodular function of the seed set. Therefore, when maximizing positive influence by selecting a seed set of fixed size, a greedy algorithm can guarantee a solution with a constant approximation ratio. For the online learning setting, we propose an algorithm that runs in epochs of growing lengths, each consisting of a fixed number of exploration rounds followed by an increasing number of exploitation rounds controlled by a hyperparameter. Under mild assumptions, we show that our algorithm achieves asymptotic expected average scaled regret that is inversely related to any fractional constant power of the number of rounds. Managerial implications: During seed selection, our negativity-aware models and algorithms allow WoM campaigns to discover and best account for characteristics of local users and propagated content. We also give the first algorithms with regret guarantees for influence maximization under node-level feedback.
LGNov 8, 2019
Beyond Adaptive Submodularity: Adaptive Influence Maximization with Intermediary ConstraintsShatian Wang, Zhen Xu, Van-Anh Truong
We consider a brand with a given budget that wants to promote a product over multiple rounds of influencer marketing. In each round, it commissions an influencer to promote the product over a social network, and then observes the subsequent diffusion of the product before adaptively choosing the next influencer to commission. This process terminates when the budget is exhausted. We assume that the diffusion process follows the popular Independent Cascade model. We also consider an online learning setting, where the brand initially does not know the diffusion parameters associated with the model, and has to gradually learn the parameters over time. Unlike in existing models, the rounds in our model are correlated through an intermediary constraint: each user can be commissioned for an unlimited number of times. However, each user will spread influence without commission at most once. Due to this added constraint, the order in which the influencers are chosen can change the influence spread, making obsolete existing analysis techniques that based on the notion of adaptive submodularity. We devise a sample path analysis to prove that a greedy policy that knows the diffusion parameters achieves at least $1-1/e - ε$ times the expected reward of the optimal policy. In the online-learning setting, we are the first to consider a truly adaptive decision making framework, rather than assuming independent epochs, and adaptivity only within epochs. Under mild assumptions, we derive a regret bound for our algorithm. In our numerical experiments, we simulate information diffusions on four Twitter sub-networks, and compare our UCB-based learning algorithms with several baseline adaptive seeding strategies. Our learning algorithm consistently outperforms the baselines and achieves rewards close to the greedy policy that knows the true diffusion parameters.