R. Ravi

LG
14papers
180citations
Novelty50%
AI Score54

14 Papers

COMay 30
A Min-Max Relation on Dicuts and Dijoins in Weighted Chordal Digraphs

Gérard Cornuéjols, Siyue Liu, R. Ravi

In a digraph, a dicut is a cut where all the arcs cross in one direction. A dijoin is a subset of arcs that intersects every dicut. Edmonds and Giles conjectured that in a weighted digraph, the minimum weight of a dicut is equal to the maximum size of a packing of dijoins. This has been disproved. However, the unweighted version conjectured by Woodall remains open. We prove that the Edmonds-Giles conjecture is true if the underlying undirected graph is chordal. We also give a strongly polynomial-time algorithm to construct such a packing.

MAMay 9, 2022
Informed Steiner Trees: Sampling and Pruning for Multi-Goal Path Finding in High Dimensions

Nikhil Chandak, Kenny Chour, Sivakumar Rathinam et al.

We interleave sampling based motion planning methods with pruning ideas from minimum spanning tree algorithms to develop a new approach for solving a Multi-Goal Path Finding (MGPF) problem in high dimensional spaces. The approach alternates between sampling points from selected regions in the search space and de-emphasizing regions that may not lead to good solutions for MGPF. Our approach provides an asymptotic, 2-approximation guarantee for MGPF. We also present extensive numerical results to illustrate the advantages of our proposed approach over uniform sampling in terms of the quality of the solutions found and computation speed.

DSApr 30
Perfect Fractional Matchings in Bipartite Graphs Via Proportional Allocations

Daniel Hathcock, R. Ravi

Given a bipartite graph that has a perfect matching, a prefect proportional allocation is an assignment of positive weights to the nodes of the right partition so that every left node is fractionally assigned to its neighbors in proportion to their weights, and these assignments define a fractional perfect matching. We prove that a bipartite graph has a perfect proportional allocation if and only if it is matching covered, by using a classical result on matrix scaling. We also present an extension of this result to provide a simple allocation strategy in non-matching covered bipartite graphs.

DSApr 30
The Telephone $k$-Multicast Problem

Daniel Hathcock, Guy Kortsarz, R. Ravi

We consider minimum time multicasting problems in directed and undirected graphs: given a root node and a subset of $t$ terminal nodes, multicasting seeks to find the minimum number of rounds within which all terminals can be informed with a message originating at the root. In each round, the telephone model we study allows the information to move via a matching from the informed nodes to the uninformed nodes. Since minimum time multicasting in digraphs is poorly understood compared to the undirected variant, we study an intermediate problem in undirected graphs that specifies a target $k < t$, and requires that only $k$ of the terminals be informed in the minimum number of rounds. For this problem, we improve the implications of the previous results and obtain a multiplicative approximation factor of $\tilde{O}(t^{1/3})$. For the directed version, we obtain an additive $\tilde{O}(k^{1/2})$ approximation algorithm (with a polylogarithmic multiplicative factor). Our algorithms are based on reductions to the related problems of finding $k$-trees of minimum poise (sum of maximum degree and diameter) and applying a combination of greedy network decomposition techniques and set covering under partition matroid constraints. We also study the problem of bounded degree Directed Steiner Tree, for which we obtain improved polylogarithmic approximations for the special case of bounded treewidth graphs. This extends prior work on the Group Steiner Tree problem.

DSMay 13
Improved Speed via Regional Fulfillment

Daniel Hathcock, R. Ravi, Amitabh Sinha

In e-retail, order fulfillment speed has become one of the most important metrics affecting customer satisfaction. While common wisdom dictates that maintaining a large global fulfillment network maximizes efficiency via economies of scale, recent evidence has shown that breaking up the network into smaller regions can yield significant speed improvements. In this paper, we consider a simple abstract model of order fulfillment by which we explain this phenomenon. We characterize fulfillment assignments satisfying an equilibrium condition based on the greedy fulfillment strategy, and quantify how the resulting fulfillment delay can be decreased by regionalizing the network. Finally, we provide some algorithmic results for computing low delay assignments, and some simulations supporting our equilibrium framework.

GTOct 17, 2025
How to Sell High-Dimensional Data Optimally

Andrew Li, R. Ravi, Karan Singh et al.

Motivated by the problem of selling large, proprietary data, we consider an information pricing problem proposed by Bergemann et al. that involves a decision-making buyer and a monopolistic seller. The seller has access to the underlying state of the world that determines the utility of the various actions the buyer may take. Since the buyer gains greater utility through better decisions resulting from more accurate assessments of the state, the seller can therefore promise the buyer supplemental information at a price. To contend with the fact that the seller may not be perfectly informed about the buyer's private preferences (or utility), we frame the problem of designing a data product as one where the seller designs a revenue-maximizing menu of statistical experiments. Prior work by Cai et al. showed that an optimal menu can be found in time polynomial in the state space, whereas we observe that the state space is naturally exponential in the dimension of the data. We propose an algorithm which, given only sampling access to the state space, provably generates a near-optimal menu with a number of samples independent of the state space. We then analyze a special case of high-dimensional Gaussian data, showing that (a) it suffices to consider scalar Gaussian experiments, (b) the optimal menu of such experiments can be found efficiently via a semidefinite program, and (c) full surplus extraction occurs if and only if a natural separation condition holds on the set of potential preferences of the buyer.

LGDec 23, 2023
Optimal Decision Tree and Adaptive Submodular Ranking with Noisy Outcomes

Su Jia, Fatemeh Navidi, Viswanath Nagarajan et al.

In pool-based active learning, the learner is given an unlabeled data set and aims to efficiently learn the unknown hypothesis by querying the labels of the data points. This can be formulated as the classical Optimal Decision Tree (ODT) problem: Given a set of tests, a set of hypotheses, and an outcome for each pair of test and hypothesis, our objective is to find a low-cost testing procedure (i.e., decision tree) that identifies the true hypothesis. This optimization problem has been extensively studied under the assumption that each test generates a deterministic outcome. However, in numerous applications, for example, clinical trials, the outcomes may be uncertain, which renders the ideas from the deterministic setting invalid. In this work, we study a fundamental variant of the ODT problem in which some test outcomes are noisy, even in the more general case where the noise is persistent, i.e., repeating a test gives the same noisy output. Our approximation algorithms provide guarantees that are nearly best possible and hold for the general case of a large number of noisy outcomes per test or per hypothesis where the performance degrades continuously with this number. We numerically evaluated our algorithms for identifying toxic chemicals and learning linear classifiers, and observed that our algorithms have costs very close to the information-theoretic minimum.

LGDec 23, 2023
Short-lived High-volume Multi-A(rmed)/B(andits) Testing

Su Jia, Andrew Li, R. Ravi et al.

Modern platforms leverage randomized experiments to make informed decisions from a given set of items (``treatments''). As a particularly challenging scenario, these items may (i) arrive in high volume, with thousands of new items being released per hour, and (ii) have short lifetime, say, due to the item's transient nature or underlying non-stationarity that impels the platform to perceive the same item as distinct copies over time. Thus motivated, we study a Bayesian multiple-play bandit problem that encapsulates the key features of the multivariate testing (or ``multi-A/B testing'') problem with a high volume of short-lived arms. In each round, a set of $k$ arms arrive, each available for $w$ rounds. Without knowing the mean reward for each arm, the learner selects a multiset of $n$ arms and immediately observes their realized rewards. We aim to minimize the loss due to not knowing the mean rewards, averaged over instances generated from a given prior distribution. We show that when $k = O(n^ρ)$ for some constant $ρ>0$, our proposed policy has $\tilde O(n^{-\min \{ρ, \frac 12 (1+\frac 1w)^{-1}\}})$ loss on a sufficiently large class of prior distributions. We complement this result by showing that every policy suffers $Ω(n^{-\min \{ρ, \frac 12\}})$ loss on the same class of distributions. We further validate the effectiveness of our policy through a large-scale field experiment on {\em Glance}, a content-card-serving platform that faces exactly the above challenge. A simple variant of our policy outperforms the platform's current recommender by 4.32\% in total duration and 7.48\% in total number of click-throughs.

LGDec 23, 2023
Markdown Pricing Under an Unknown Parametric Demand Model

Su Jia, Andrew Li, R. Ravi

Consider a single-product revenue-maximization problem where the seller monotonically decreases the price in $n$ rounds with an unknown demand model coming from a given family. Without monotonicity, the minimax regret is $\tilde O(n^{2/3})$ for the Lipschitz demand family and $\tilde O(n^{1/2})$ for a general class of parametric demand models. With monotonicity, the minimax regret is $\tilde O(n^{3/4})$ if the revenue function is Lipschitz and unimodal. However, the minimax regret for parametric families remained open. In this work, we provide a complete settlement for this fundamental problem. We introduce the crossing number to measure the complexity of a family of demand functions. In particular, the family of degree-$k$ polynomials has a crossing number $k$. Based on conservatism under uncertainty, we present (i) a policy with an optimal $Θ(\log^2 n)$ regret for families with crossing number $k=0$, and (ii) another policy with an optimal $\tilde Θ(n^{k/(k+1)})$ regret when $k\ge 1$. These bounds are asymptotically higher than the $\tilde O(\log n)$ and $\tilde Θ(\sqrt n)$ minimax regret for the same families without the monotonicity constraint.

LGNov 23, 2020
Learnable and Instance-Robust Predictions for Online Matching, Flows and Load Balancing

Thomas Lavastida, Benjamin Moseley, R. Ravi et al.

We propose a new model for augmenting algorithms with predictions by requiring that they are formally learnable and instance robust. Learnability ensures that predictions can be efficiently constructed from a reasonable amount of past data. Instance robustness ensures that the prediction is robust to modest changes in the problem input, where the measure of the change may be problem specific. Instance robustness insists on a smooth degradation in performance as a function of the change. Ideally, the performance is never worse than worst-case bounds. This also allows predictions to be objectively compared. We design online algorithms with predictions for a network flow allocation problem and restricted assignment makespan minimization. For both problems, two key properties are established: high quality predictions can be learned from a small sample of prior instances and these predictions are robust to errors that smoothly degrade as the underlying problem instance changes.

LGJun 10, 2019
Stretching the Effectiveness of MLE from Accuracy to Bias for Pairwise Comparisons

Jingyan Wang, Nihar B. Shah, R. Ravi

A number of applications (e.g., AI bot tournaments, sports, peer grading, crowdsourcing) use pairwise comparison data and the Bradley-Terry-Luce (BTL) model to evaluate a given collection of items (e.g., bots, teams, students, search results). Past work has shown that under the BTL model, the widely-used maximum-likelihood estimator (MLE) is minimax-optimal in estimating the item parameters, in terms of the mean squared error. However, another important desideratum for designing estimators is fairness. In this work, we consider fairness modeled by the notion of bias in statistics. We show that the MLE incurs a suboptimal rate in terms of bias. We then propose a simple modification to the MLE, which "stretches" the bounding box of the maximum-likelihood optimizer by a small constant factor from the underlying ground truth domain. We show that this simple modification leads to an improved rate in bias, while maintaining minimax-optimality in the mean squared error. In this manner, our proposed class of estimators provably improves fairness represented by bias without loss in accuracy.

IRNov 30, 2018
A new system-wide diversity measure for recommendations with efficient algorithms

Arda Antikacioglu, Tanvi Bajpai, R. Ravi

Recommender systems often operate on item catalogs clustered by genres, and user bases that have natural clusterings into user types by demographic or psychographic attributes. Prior work on system-wide diversity has mainly focused on defining intent-aware metrics among such categories and maximizing relevance of the resulting recommendations, but has not combined the notions of diversity from the two point of views of items and users. In this work, (1) we introduce two new system-wide diversity metrics to simultaneously address the problems of diversifying the categories of items that each user sees, diversifying the types of users that each item is shown, and maintaining high recommendation quality. We model this as a subgraph selection problem on the bipartite graph of candidate recommendations between users and items. (2) In the case of disjoint item categories and user types, we show that the resulting problems can be solved exactly in polynomial time, by a reduction to a minimum cost flow problem. (3) In the case of non-disjoint categories and user types, we prove NP-completeness of the objective and present efficient approximation algorithms using the submodularity of the objective. (4) Finally, we validate the effectiveness of our algorithms on the MovieLens-1m and Netflix datasets, and show that algorithms designed for our objective also perform well on sales diversity metrics, and even some intent-aware diversity metrics. Our experimental results justify the validity of our new composite diversity metrics.

IRSep 6, 2014
Recommendation Subgraphs for Web Discovery

Arda Antikacioglu, R. Ravi, Srinath Srihdar

Recommendations are central to the utility of many websites including YouTube, Quora as well as popular e-commerce stores. Such sites typically contain a set of recommendations on every product page that enables visitors to easily navigate the website. Choosing an appropriate set of recommendations at each page is one of the key features of backend engines that have been deployed at several e-commerce sites. Specifically at BloomReach, an engine consisting of several independent components analyzes and optimizes its clients' websites. This paper focuses on the structure optimizer component which improves the website navigation experience that enables the discovery of novel content. We begin by formalizing the concept of recommendations used for discovery. We formulate this as a natural graph optimization problem which in its simplest case, reduces to a bipartite matching problem. In practice, solving these matching problems requires superlinear time and is not scalable. Also, implementing simple algorithms is critical in practice because they are significantly easier to maintain in production. This motivated us to analyze three methods for solving the problem in increasing order of sophistication: a sampling algorithm, a greedy algorithm and a more involved partitioning based algorithm. We first theoretically analyze the performance of these three methods on random graph models characterizing when each method will yield a solution of sufficient quality and the parameter ranges when more sophistication is needed. We complement this by providing an empirical analysis of these algorithms on simulated and real-world production data. Our results confirm that it is not always necessary to implement complicated algorithms in the real-world and that very good practical results can be obtained by using heuristics that are backed by the confidence of concrete theoretical guarantees.

DSApr 26, 2012
Geometry of Online Packing Linear Programs

Marco Molinaro, R. Ravi

We consider packing LP's with $m$ rows where all constraint coefficients are normalized to be in the unit interval. The n columns arrive in random order and the goal is to set the corresponding decision variables irrevocably when they arrive so as to obtain a feasible solution maximizing the expected reward. Previous (1 - ε)-competitive algorithms require the right-hand side of the LP to be Omega((m/ε^2) log (n/ε)), a bound that worsens with the number of columns and rows. However, the dependence on the number of columns is not required in the single-row case and known lower bounds for the general case are also independent of n. Our goal is to understand whether the dependence on n is required in the multi-row case, making it fundamentally harder than the single-row version. We refute this by exhibiting an algorithm which is (1 - ε)-competitive as long as the right-hand sides are Omega((m^2/ε^2) log (m/ε)). Our techniques refine previous PAC-learning based approaches which interpret the online decisions as linear classifications of the columns based on sampled dual prices. The key ingredient of our improvement comes from a non-standard covering argument together with the realization that only when the columns of the LP belong to few 1-d subspaces we can obtain small such covers; bounding the size of the cover constructed also relies on the geometry of linear classifiers. General packing LP's are handled by perturbing the input columns, which can be seen as making the learning problem more robust.