LGSep 10, 2020
Learning Product Rankings Robust to Fake UsersNegin Golrezaei, Vahideh Manshadi, Jon Schneider et al.
In many online platforms, customers' decisions are substantially influenced by product rankings as most customers only examine a few top-ranked products. Concurrently, such platforms also use the same data corresponding to customers' actions to learn how these products must be ranked or ordered. These interactions in the underlying learning process, however, may incentivize sellers to artificially inflate their position by employing fake users, as exemplified by the emergence of click farms. Motivated by such fraudulent behavior, we study the ranking problem of a platform that faces a mixture of real and fake users who are indistinguishable from one another. We first show that existing learning algorithms---that are optimal in the absence of fake users---may converge to highly sub-optimal rankings under manipulation by fake users. To overcome this deficiency, we develop efficient learning algorithms under two informational environments: in the first setting, the platform is aware of the number of fake users, and in the second setting, it is agnostic to the number of fake users. For both these environments, we prove that our algorithms converge to the optimal ranking, while being robust to the aforementioned fraudulent behavior; we also present worst-case performance guarantees for our methods, and show that they significantly outperform existing algorithms. At a high level, our work employs several novel approaches to guarantee robustness such as: (i) constructing product-ordering graphs that encode the pairwise relationships between products inferred from the customers' actions; and (ii) implementing multiple levels of learning with a judicious amount of bi-directional cross-learning between levels.
LGJul 6, 2018
Combinatorial Bandits for Incentivizing Agents with Dynamic PreferencesTanner Fiez, Shreyas Sekar, Liyuan Zheng et al.
The design of personalized incentives or recommendations to improve user engagement is gaining prominence as digital platform providers continually emerge. We propose a multi-armed bandit framework for matching incentives to users, whose preferences are unknown a priori and evolving dynamically in time, in a resource constrained environment. We design an algorithm that combines ideas from three distinct domains: (i) a greedy matching paradigm, (ii) the upper confidence bound algorithm (UCB) for bandits, and (iii) mixing times from the theory of Markov chains. For this algorithm, we provide theoretical bounds on the regret and demonstrate its performance via both synthetic and realistic (matching supply and demand in a bike-sharing platform) examples.
LGMar 11, 2018
Multi-Armed Bandits for Correlated Markovian Environments with Smoothed Reward FeedbackTanner Fiez, Shreyas Sekar, Lillian J. Ratliff
We study a multi-armed bandit problem in a dynamic environment where arm rewards evolve in a correlated fashion according to a Markov chain. Different than much of the work on related problems, in our formulation a learning algorithm does not have access to either a priori information or observations of the state of the Markov chain and only observes smoothed reward feedback following time intervals we refer to as epochs. We demonstrate that existing methods such as UCB and $\varepsilon$-greedy can suffer linear regret in such an environment. Employing mixing-time bounds on Markov chains, we develop algorithms called EpochUCB and EpochGreedy that draw inspiration from the aforementioned methods, yet which admit sublinear regret guarantees for the problem formulation. Our proposed algorithms proceed in epochs in which an arm is played repeatedly for a number of iterations that grows linearly as a function of the number of times an arm has been played in the past. We analyze these algorithms under two types of smoothed reward feedback at the end of each epoch: a reward that is the discount-average of the discounted rewards within an epoch, and a reward that is the time-average of the rewards within an epoch.
GTOct 13, 2016
Truthful Mechanisms for Matching and Clustering in an Ordinal WorldElliot Anshelevich, Shreyas Sekar
We study truthful mechanisms for matching and related problems in a partial information setting, where the agents' true utilities are hidden, and the algorithm only has access to ordinal preference information. Our model is motivated by the fact that in many settings, agents cannot express the numerical values of their utility for different outcomes, but are still able to rank the outcomes in their order of preference. Specifically, we study problems where the ground truth exists in the form of a weighted graph of agent utilities, but the algorithm can only elicit the agents' private information in the form of a preference ordering for each agent induced by the underlying weights. Against this backdrop, we design truthful algorithms to approximate the true optimum solution with respect to the hidden weights. Our techniques yield universally truthful algorithms for a number of graph problems: a 1.76-approximation algorithm for Max-Weight Matching, 2-approximation algorithm for Max k-matching, a 6-approximation algorithm for Densest k-subgraph, and a 2-approximation algorithm for Max Traveling Salesman as long as the hidden weights constitute a metric. We also provide improved approximation algorithms for such problems when the agents are not able to lie about their preferences. Our results are the first non-trivial truthful approximation algorithms for these problems, and indicate that in many situations, we can design robust algorithms even when the agents may lie and only provide ordinal information instead of precise utilities.
GTDec 17, 2015
Blind, Greedy, and Random: Ordinal Approximation Algorithms for Matching and ClusteringElliot Anshelevich, Shreyas Sekar
We study Matching and other related problems in a partial information setting where the agents' utilities for being matched to other agents are hidden and the mechanism only has access to ordinal preference information. Our model is motivated by the fact that in many settings, agents cannot express the numerical values of their utility for different outcomes, but are still able to rank the outcomes in their order of preference. Specifically, we study problems where the ground truth exists in the form of a weighted graph, and look to design algorithms that approximate the true optimum matching using only the preference orderings for each agent (induced by the hidden weights) as input. If no restrictions are placed on the weights, then one cannot hope to do better than the simple greedy algorithm, which yields a half optimal matching. Perhaps surprisingly, we show that by imposing a little structure on the weights, we can improve upon the trivial algorithm significantly: we design a 1.6-approximation algorithm for instances where the hidden weights obey the metric inequality. Using our algorithms for matching as a black-box, we also design new approximation algorithms for other closely related problems: these include a a 3.2-approximation for the problem of clustering agents into equal sized partitions, a 4-approximation algorithm for Densest k-subgraph, and a 2.14-approximation algorithm for Max TSP. These results are the first non-trivial ordinal approximation algorithms for such problems, and indicate that we can design robust algorithms even when we are agnostic to the precise agent utilities.
GTAug 27, 2015
Computing Stable Coalitions: Approximation Algorithms for Reward SharingElliot Anshelevich, Shreyas Sekar
Consider a setting where selfish agents are to be assigned to coalitions or projects from a fixed set P. Each project k is characterized by a valuation function; v_k(S) is the value generated by a set S of agents working on project k. We study the following classic problem in this setting: "how should the agents divide the value that they collectively create?". One traditional approach in cooperative game theory is to study core stability with the implicit assumption that there are infinite copies of one project, and agents can partition themselves into any number of coalitions. In contrast, we consider a model with a finite number of non-identical projects; this makes computing both high-welfare solutions and core payments highly non-trivial. The main contribution of this paper is a black-box mechanism that reduces the problem of computing a near-optimal core stable solution to the purely algorithmic problem of welfare maximization; we apply this to compute an approximately core stable solution that extracts one-fourth of the optimal social welfare for the class of subadditive valuations. We also show much stronger results for several popular sub-classes: anonymous, fractionally subadditive, and submodular valuations, as well as provide new approximation algorithms for welfare maximization with anonymous functions. Finally, we establish a connection between our setting and the well-studied simultaneous auctions with item bidding; we adapt our results to compute approximate pure Nash equilibria for these auctions.