Lirong Xia

h-index: 2

41 papers

570 citations

Novelty: 50%

AI Score: 44

Ranked #74,978 of 201,326 authors (top 37%) · #162 in GT (top 42%)

41 Papers

GT · Jun 27, 2022

Differentially Private Condorcet Voting

Zhechen Li, Ao Liu, Lirong Xia et al.

Designing private voting rules is an important and pressing problem for trustworthy democracy. In this paper, under the framework of differential privacy, we propose a novel family of randomized voting rules based on the well-known Condorcet method, and focus on three classes of voting rules in this family: Laplacian Condorcet method ($\CMLAP_λ$), exponential Condorcet method ($\CMEXP_λ$), and randomized response Condorcet method ($\CMRR_λ$), where $λ$ represents the level of noise. We prove that all of our rules satisfy absolute monotonicity, lexi-participation, probabilistic Pareto efficiency, approximate probabilistic Condorcet criterion, and approximate SD-strategyproofness. In addition, $\CMRR_λ$ satisfies (non-approximate) probabilistic Condorcet criterion, while $\CMLAP_λ$ and $\CMEXP_λ$ satisfy strong lexi-participation. Finally, we regard differential privacy as a voting axiom, and discuss its relations to other axioms.

GT · May 30, 2022

Most Equitable Voting Rules

Lirong Xia

In social choice theory, anonymity (all agents being treated equally) and neutrality (all alternatives being treated equally) are widely regarded as "minimal demands" and "uncontroversial" axioms of equity and fairness. However, the ANR impossibility -- there is no voting rule that satisfies anonymity, neutrality, and resolvability (always choosing one winner) -- holds even in the simple setting of two alternatives and two agents. How to design voting rules that optimally satisfy anonymity, neutrality, and resolvability remains an open question. We address the optimal design question for a wide range of preferences and decisions that include ranked lists and committees. Our conceptual contribution is a novel and strong notion of most equitable refinements that optimally preserves anonymity and neutrality for any irresolute rule that satisfies the two axioms. Our technical contributions are twofold. First, we characterize the conditions for the ANR impossibility to hold under general settings, especially when the number of agents is large. Second, we propose the most-favorable-permutation (MFP) tie-breaking to compute a most equitable refinement and design a polynomial-time algorithm to compute MFP when agents' preferences are full rankings.
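The two-agent, two-alternative case of the ANR impossibility mentioned above is small enough to check by brute force. The sketch below is only an illustration, not code from the paper: it enumerates all 16 resolute rules on this domain and keeps those that are both anonymous and neutral. The names `PROFILES`, `swap_alt`, and `anr_rules` are hypothetical helpers introduced here.

```python
from itertools import product

ALTS = ("a", "b")
# A profile records which alternative each of the two agents votes for.
PROFILES = list(product(ALTS, repeat=2))


def swap_alt(x):
    """Apply the only non-trivial permutation of the two alternatives."""
    return "b" if x == "a" else "a"


def is_anonymous(rule):
    # Swapping the two agents must not change the winner.
    return all(rule[(v1, v2)] == rule[(v2, v1)] for v1, v2 in PROFILES)


def is_neutral(rule):
    # Renaming alternatives in the profile must rename the winner accordingly.
    return all(
        rule[(swap_alt(v1), swap_alt(v2))] == swap_alt(rule[(v1, v2)])
        for v1, v2 in PROFILES
    )


def anr_rules():
    """Return every resolute rule that is both anonymous and neutral."""
    found = []
    for winners in product(ALTS, repeat=len(PROFILES)):
        rule = dict(zip(PROFILES, winners))  # a resolute rule picks one winner per profile
        if is_anonymous(rule) and is_neutral(rule):
            found.append(rule)
    return found


print(len(anr_rules()))
```

The search comes back empty because the split profile (a, b) maps to itself under swapping both the agents and the alternatives, so any single winner chosen there violates one of the two axioms.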

CL · Oct 12, 2023

LLM-augmented Preference Learning from Natural Language

Inwon Kang, Sikai Ruan, Tyler Ho et al.

Finding preferences expressed in natural language is an important but challenging task. State-of-the-art (SotA) methods leverage transformer-based models such as BERT, RoBERTa, etc. and graph neural architectures such as graph attention networks. Since Large Language Models (LLMs) are equipped to deal with larger context lengths and have much larger model sizes than the transformer-based model, we investigate their ability to classify comparative text directly. This work aims to serve as a first step towards using LLMs for the CPC task. We design and conduct a set of experiments that format the classification task into an input prompt for the LLM and a methodology to get a fixed-format response that can be automatically evaluated. Comparing performances with existing methods, we see that pre-trained LLMs are able to outperform the previous SotA models with no fine-tuning involved. Our results show that the LLMs can consistently outperform the SotA when the target text is large (i.e., composed of multiple sentences), and are still comparable to the SotA performance in shorter text. We also find that few-shot learning yields better performance than zero-shot learning.

LG · May 21

Mixture of Complementary Agents for Robust LLM Ensemble

Yichi Zhang, Kevin Lu, Yuang Zhang et al.

Multi-AI collaboration, such as ensembling or debating large language models (LLMs), is a promising paradigm for aggregating information and boosting performance. A foundational step in these pipelines is to feed the responses of several proposer LLMs into a summarizer LLM, which synthesizes a better answer. However, choosing which proposers to include is non-trivial. Existing approaches primarily focus either on accuracy (picking the strongest models) or diversity (ensuring variety), and often overlook the interactions among proposers and with the summarizer. We reframe proposer selection as a combinatorial selection problem akin to feature selection, where the value of an LLM lies in its complementarity with others. However, directly applying standard feature-selection algorithms is impractical in the LLM setting due to prohibitive time complexity. Motivated by this limitation, we explore an extensive range of computationally feasible, greedy-style selection algorithms that assess complementarity using a small labeled set. Our experiments validate complementarity as a guiding principle for proposer selection and identify methods that achieve the best performance-cost trade-offs in practice.

MA · May 28, 2020 · Code

OPRA: An Open-Source Online Preference Reporting and Aggregation System

Yiwei Chen, Jingwen Qian, Junming Wang et al.

We introduce the Online Preference Reporting and Aggregation (OPRA) system, an open-source online system that aims at providing support for group decision-making. We illustrate OPRA's distinctive features: UI for reporting rankings with ties, comprehensive analytics of preferences, and group decision-making in combinatorial domains. We also discuss our work in an automatic mentor matching system. We hope that the open-source nature of OPRA will foster the development of computerized group decision support systems.

GT · Oct 11, 2023

Determining Winners in Elections with Absent Votes

Qishen Han, Amélie Marian, Lirong Xia

An important question in elections is to determine whether a candidate can be a winner when some votes are absent. We study this problem of determining winners with absent votes (WAV) when the votes are top-truncated. We show that the WAV problem is NP-complete for the single transferable vote, Maximin, and Copeland, and propose a special case of positional scoring rule such that the problem can be computed in polynomial time. Our results in top-truncated rankings differ from the results in full rankings as their hardness results still hold when the number of candidates or the number of missing votes is bounded, while we show that the problem can be solved in polynomial time in either case.

GT · Mar 5, 2025

A Linear Theory of Multi-Winner Voting

Lirong Xia

We introduce a general linear framework that unifies the study of multi-winner voting rules and proportionality axioms, demonstrating that many prominent multi-winner voting rules (including Thiele methods, their sequential variants, and approval-based committee scoring rules) are linear. Similarly, key proportionality axioms such as Justified Representation (JR), Extended JR (EJR), and their strengthened variants (PJR+, EJR+), along with core stability, can fit within this linear structure as well. Leveraging PAC learning theory, we establish general and novel upper bounds on the sample complexity of learning linear mappings. Our approach yields near-optimal guarantees for diverse classes of rules, including Thiele methods and ordered weighted average rules, and can be applied to analyze the sample complexity of learning proportionality axioms such as approximate core stability. Furthermore, the linear structure allows us to leverage prior work to extend our analysis beyond worst-case scenarios to study the likelihood of various properties of linear rules and axioms. We introduce a broad class of distributions that extend Impartial Culture for approval preferences, and show that under these distributions, with high probability, any Thiele method is resolute, CORE is non-empty, and any Thiele method satisfies CORE, among other observations on the likelihood of commonly-studied properties in social choice. We believe that this linear theory offers a new perspective and powerful new tools for designing and analyzing multi-winner rules in modern social choice applications.

GT · Feb 13, 2024

Average-Case Analysis of Iterative Voting

Joshua Kavner, Lirong Xia

Iterative voting is a natural model of repeated strategic decision-making in social choice theory when agents have the opportunity to update their votes prior to finalizing the group decision. Prior work has analyzed the efficacy of iterative plurality on the welfare of the chosen outcome at equilibrium, relative to the truthful vote profile, via an adaptation of the price of anarchy. However, prior analyses have only studied the worst- and average-case performances when agents' preferences are distributed by the impartial culture. This work extends average-case analysis comprehensively across three alternatives and distinguishes under which of agents' preference distributions iterative plurality improves or degrades asymptotic welfare.

GT · May 8, 2023

First-Choice Maximality Meets Ex-ante and Ex-post Fairness

Xiaoxi Guo, Sujoy Sikdar, Lirong Xia et al.

For the assignment problem where multiple indivisible items are allocated to a group of agents given their ordinal preferences, we design randomized mechanisms that satisfy first-choice maximality (FCM), i.e., maximizing the number of agents assigned their first choices, together with Pareto efficiency (PE). Our mechanisms also provide guarantees of ex-ante and ex-post fairness. The generalized eager Boston mechanism is ex-ante envy-free, and ex-post envy-free up to one item (EF1). The generalized probabilistic Boston mechanism is also ex-post EF1, and satisfies ex-ante efficiency instead of fairness. We also show that no strategyproof mechanism satisfies ex-post PE, EF1, and FCM simultaneously. In doing so, we expand the frontiers of simultaneously providing efficiency and both ex-ante and ex-post fairness guarantees for the assignment problem.

GT · Feb 28, 2022

Anti-Malware Sandbox Games

Sujoy Sikdar, Sikai Ruan, Qishen Han et al.

We develop a game-theoretic model of malware protection using the state-of-the-art sandbox method, to characterize and compute optimal defense strategies for anti-malware. We model the strategic interaction between developers of malware (M) and anti-malware (AM) as a two-player game, where AM commits to a strategy of generating sandbox environments, and M responds by choosing to either attack or hide malicious activity based on the environment it senses. We characterize the condition for AM to protect all its machines, and identify conditions under which an optimal AM strategy can be computed efficiently. For other cases, we provide a quadratically constrained quadratic program (QCQP)-based optimization framework to compute the optimal AM strategy. In addition, we identify a natural and easy-to-compute strategy for AM, which, as we show empirically, achieves AM utility that is close to the optimal AM utility in equilibrium.

TH · Feb 13, 2022

The Impact of a Coalition: Assessing the Likelihood of Voter Influence in Large Elections

Lirong Xia

For centuries, it has been widely believed that the influence of a small coalition of voters is negligible in a large election. Consequently, there is a large body of literature on characterizing the likelihood for an election to be influenced when the votes follow certain distributions, especially the likelihood of being manipulable by a single voter under the i.i.d. uniform distribution, known as the Impartial Culture (IC). In this paper, we extend previous studies in three aspects: (1) we propose a more general semi-random model, where a distribution adversary chooses a worst-case distribution and then a contamination adversary modifies up to $ψ$ portion of the data, (2) we consider many coalitional influence problems, including coalitional manipulation, margin of victory, and various vote controls and bribery, and (3) we consider arbitrary and variable coalition size $B$. Our main theorem provides asymptotically tight bounds on the semi-random likelihood of the existence of a size-$B$ coalition that can successfully influence the election under a wide range of voting rules. Applications of the main theorem and its proof techniques resolve long-standing open questions about the likelihood of coalitional manipulability under IC, by showing that the likelihood is $Θ\left(\min\left\{\frac{B}{\sqrt n}, 1\right\}\right)$ for many commonly-studied voting rules. The main technical contribution is a characterization of the semi-random likelihood for a Poisson multinomial variable (PMV) to be unstable, which we believe to be a general and useful technique with independent interest.

GT · Sep 18, 2021

Favoring Eagerness for Remaining Items: Designing Efficient, Fair, and Strategyproof Mechanisms

Xiaoxi Guo, Sujoy Sikdar, Lirong Xia et al.

In the assignment problem, the goal is to assign indivisible items to agents who have ordinal preferences, efficiently and fairly, in a strategyproof manner. In practice, first-choice maximality, i.e., assigning a maximal number of agents their top items, is often identified as an important efficiency criterion and measure of agents' satisfaction. In this paper, we propose a natural and intuitive efficiency property, favoring-eagerness-for-remaining-items (FERI), which requires that each item is allocated to an agent who ranks it highest among remaining items, thereby implying first-choice maximality. Using FERI as a heuristic, we design mechanisms that satisfy ex-post or ex-ante variants of FERI together with combinations of other desirable properties of efficiency (Pareto-efficiency), fairness (strong equal treatment of equals and sd-weak-envy-freeness), and strategyproofness (sd-weak-strategyproofness). We also explore the limits of FERI mechanisms in providing stronger efficiency, fairness, or strategyproofness guarantees through impossibility results.

LG · Jul 4, 2021

Certifiably Robust Interpretation via Renyi Differential Privacy

Ao Liu, Xiaoyu Chen, Sijia Liu et al.

Motivated by the recent discovery that the interpretation maps of CNNs could easily be manipulated by adversarial attacks against network interpretability, we study the problem of interpretation robustness from a new perspective of Rényi differential privacy (RDP). The advantages of our Renyi-Robust-Smooth (RDP-based interpretation method) are threefold. First, it can offer provable and certifiable top-$k$ robustness. That is, the top-$k$ important attributions of the interpretation map are provably robust under any input perturbation with bounded $\ell_d$-norm (for any $d\geq 1$, including $d = \infty$). Second, our proposed method offers $\sim10\%$ better experimental robustness than existing approaches in terms of the top-$k$ attributions. Remarkably, the accuracy of Renyi-Robust-Smooth also outperforms existing approaches. Third, our method can provide a smooth tradeoff between robustness and computational efficiency. Experimentally, its top-$k$ attributions are twice more robust than existing approaches when the computational resources are highly constrained.

CR · Jul 4, 2021

Smoothed Differential Privacy

Ao Liu, Yu-Xiang Wang, Lirong Xia

Differential privacy (DP) is a widely-accepted and widely-applied notion of privacy based on worst-case analysis. Often, DP classifies most mechanisms without additive noise as non-private (Dwork et al., 2014). Thus, additive noises are added to improve privacy (to achieve DP). However, in many real-world applications, adding additive noise is undesirable (Bagdasaryan et al., 2019) and sometimes prohibited (Liu et al., 2020). In this paper, we propose a natural extension of DP following the worst average-case idea behind the celebrated smoothed analysis (Spielman & Teng, May 2004). Our notion, smoothed DP, can effectively measure the privacy leakage of mechanisms without additive noises under realistic settings. We prove that any discrete mechanism with sampling procedures is more private than what DP predicts, while many continuous mechanisms with sampling procedures are still non-private under smoothed DP. In addition, we prove several desirable properties of smoothed DP, including composition, robustness to post-processing, and distribution reduction. Based on those properties, we propose an efficient algorithm to calculate the privacy parameters for smoothed DP. Experimentally, we verify that, according to smoothed DP, the discrete sampling mechanisms are private in real-world elections, and some discrete neural networks can be private without adding any additive noise. We believe that these results contribute to the theoretical foundation of realistic privacy measures beyond worst-case analysis.

TH · Jun 3, 2021

The Smoothed Satisfaction of Voting Axioms

Lirong Xia

We initiate the work towards a comprehensive picture of the smoothed satisfaction of voting axioms, to provide a finer and more realistic foundation for comparing voting rules. We adopt the smoothed social choice framework, where an adversary chooses arbitrarily correlated "ground truth" preferences for the agents, on top of which random noises are added. We focus on characterizing the smoothed satisfaction of two well-studied voting axioms: Condorcet criterion and participation. We prove that for any fixed number of alternatives, when the number of voters $n$ is sufficiently large, the smoothed satisfaction of the Condorcet criterion under a wide range of voting rules is $1$, $1-\exp(-Θ(n))$, $Θ(n^{-0.5})$, $\exp(-Θ(n))$, or being $Θ(1)$ and $1-Θ(1)$ at the same time; and the smoothed satisfaction of participation is $1-Θ(n^{-0.5})$. Our results address open questions by Berg and Lepelley in 1994 for these rules, and also confirm the following high-level message: the Condorcet criterion is a bigger concern than participation under realistic models.

GT · Jan 29, 2021

Sequential Mechanisms for Multi-type Resource Allocation

Sujoy Sikdar, Xiaoxi Guo, Haibin Wang et al.

Several resource allocation problems involve multiple types of resources, with a different agency being responsible for "locally" allocating the resources of each type, while a central planner wishes to provide a guarantee on the properties of the final allocation given agents' preferences. We study the relationship between properties of the local mechanisms, each responsible for assigning all of the resources of a designated type, and the properties of a sequential mechanism which is composed of these local mechanisms, one for each type, applied sequentially, under lexicographic preferences, a well studied model of preferences over multiple types of resources in artificial intelligence and economics. We show that when preferences are O-legal, meaning that agents share a common importance order on the types, sequential mechanisms satisfy the desirable properties of anonymity, neutrality, non-bossiness, or Pareto-optimality if and only if every local mechanism also satisfies the same property, and they are applied sequentially according to the order O. Our main results are that under O-legal lexicographic preferences, every mechanism satisfying strategyproofness and a combination of these properties must be a sequential composition of local mechanisms that are also strategyproof, and satisfy the same combinations of properties.

ST · Jun 19, 2020

Optimal Statistical Hypothesis Testing for Social Choice

Lirong Xia

We address the following question in this paper: "What are the most robust statistical methods for social choice?" By leveraging the theory of uniformly least favorable distributions in the Neyman-Pearson framework to finite models and randomized tests, we characterize uniformly most powerful (UMP) tests, which is a well-accepted statistical optimality w.r.t. robustness, for testing whether a given alternative is the winner under Mallows' model and under Condorcet's model, respectively.

GT · Jun 11, 2020

The Smoothed Possibility of Social Choice

Lirong Xia

We develop a framework that leverages the smoothed complexity analysis by Spielman and Teng to circumvent paradoxes and impossibility theorems in social choice, motivated by modern applications of social choice powered by AI and ML. For Condorcet's paradox, we prove that the smoothed likelihood of the paradox either vanishes at an exponential rate as the number of agents increases, or does not vanish at all. For the ANR impossibility on the non-existence of voting rules that simultaneously satisfy anonymity, neutrality, and resolvability, we characterize the rate for the impossibility to vanish, to be either polynomially fast or exponentially fast. We also propose a novel easy-to-compute tie-breaking mechanism that optimally preserves anonymity and neutrality for an even number of alternatives in natural settings. Our results illustrate the smoothed possibility of social choice -- even though the paradox and the impossibility theorem hold in the worst case, they may not be a big concern in practice.
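As a concrete reminder of what Condorcet's paradox looks like, here is a minimal sketch (not taken from the paper) that searches a profile of full rankings for a Condorcet winner, meaning an alternative that a strict majority prefers to every other alternative. The function names and the example profiles are illustrative assumptions.

```python
def majority_prefers(profile, x, y):
    """True if strictly more voters rank x above y than y above x."""
    x_over_y = sum(1 for ranking in profile if ranking.index(x) < ranking.index(y))
    return x_over_y > len(profile) - x_over_y


def condorcet_winner(profile):
    """Return the Condorcet winner of a profile of full rankings, or None."""
    alternatives = profile[0]
    for x in alternatives:
        if all(majority_prefers(profile, x, y) for y in alternatives if y != x):
            return x
    return None


# The classic three-voter cycle: a beats b, b beats c, and c beats a.
cyclic = [("a", "b", "c"), ("b", "c", "a"), ("c", "a", "b")]
# A profile without a cycle: a beats both b and c.
acyclic = [("a", "b", "c"), ("a", "c", "b"), ("b", "a", "c")]

print(condorcet_winner(cyclic), condorcet_winner(acyclic))
```

The cyclic profile has no Condorcet winner, which is the worst-case instance of the paradox; the abstract's result concerns how likely such profiles are under the smoothed model, which this sketch does not attempt to compute.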

LG · Jun 6, 2020

Learning Mixtures of Random Utility Models with Features from Incomplete Preferences

Zhibing Zhao, Ao Liu, Lirong Xia

Random Utility Models (RUMs), which subsume Plackett-Luce model (PL) as a special case, are among the most popular models for preference learning. In this paper, we consider RUMs with features and their mixtures, where each alternative has a vector of features, possibly different across agents. Such models significantly generalize the standard PL and RUMs, but are not as well investigated in the literature. We extend mixtures of RUMs with features to models that generate incomplete preferences and characterize their identifiability. For PL, we prove that when PL with features is identifiable, its MLE is consistent with a strictly concave objective function under mild assumptions, by characterizing a bound on root-mean-square-error (RMSE), which naturally leads to a sample complexity bound. We also characterize identifiability of more general RUMs with features and propose a generalized RBCML to learn them. Our experiments on synthetic data demonstrate the effectiveness of MLE on PL with features with tradeoffs between statistical efficiency and computational efficiency. Our experiments on real-world data show the prediction power of PL with features and its mixtures.

LG · May 17, 2020

Dual Learning: Theoretical Study and an Algorithmic Extension

Zhibing Zhao, Yingce Xia, Tao Qin et al.

Dual learning has been successfully applied in many machine learning applications including machine translation, image-to-image transformation, etc. The high-level idea of dual learning is very intuitive: if we map an $x$ from one domain to another and then map it back, we should recover the original $x$. Although its effectiveness has been empirically verified, theoretical understanding of dual learning is still very limited. In this paper, we aim at understanding why and when dual learning works. Based on our theoretical analysis, we further extend dual learning by introducing more related mappings and propose multi-step dual learning, in which we leverage feedback signals from additional domains to improve the qualities of the mappings. We prove that multi-step dual learning can boost the performance of standard dual learning under mild conditions. Experiments on WMT 14 English$\leftrightarrow$German and MultiUN English$\leftrightarrow$French translations verify our theoretical findings on dual learning, and the results on the translations among English, French, and Spanish of MultiUN demonstrate the effectiveness of multi-step dual learning.

GT · Apr 25, 2020

Probabilistic Serial Mechanism for Multi-Type Resource Allocation

Xiaoxi Guo, Sujoy Sikdar, Haibin Wang et al.

In multi-type resource allocation (MTRA) problems, there are p $\ge$ 2 types of items, and n agents, who each demand one unit of items of each type, and have strict linear preferences over bundles consisting of one item of each type. For MTRAs with indivisible items, our first result is an impossibility theorem that is in direct contrast to the single type (p = 1) setting: No mechanism, the output of which is always decomposable into a probability distribution over discrete assignments (where no item is split between agents), can satisfy both sd-efficiency and sd-envy-freeness. To circumvent this impossibility result, we consider the natural assumption of lexicographic preference, and provide an extension of the probabilistic serial (PS), called lexicographic probabilistic serial (LexiPS). We prove that LexiPS satisfies sd-efficiency and sd-envy-freeness, retaining the desirable properties of PS. Moreover, LexiPS satisfies sd-weak-strategyproofness when agents are not allowed to misreport their importance orders. For MTRAs with divisible items, we show that the existing multi-type probabilistic serial (MPS) mechanism satisfies the stronger efficiency notion of lexi-efficiency, and is sd-envy-free under strict linear preferences, and sd-weak-strategyproof under lexicographic preferences. We also prove that MPS can be characterized both by leximin-optimality and by item-wise ordinal fairness, and the family of eating algorithms which MPS belongs to can be characterized by no-generalized-cycle condition.

LG · Oct 25, 2019

Learning Mixtures of Plackett-Luce Models from Structured Partial Orders

Zhibing Zhao, Lirong Xia

Mixtures of ranking models have been widely used for heterogeneous preferences. However, learning a mixture model is highly nontrivial, especially when the dataset consists of partial orders. In such cases, the parameter of the model may not even be identifiable. In this paper, we focus on three popular structures of partial orders: ranked top-$l_1$, $l_2$-way, and choice data over a subset of alternatives. We prove that when the dataset consists of combinations of ranked top-$l_1$ and $l_2$-way (or choice data over up to $l_2$ alternatives), mixture of $k$ Plackett-Luce models is not identifiable when $l_1+l_2\le 2k-1$ ($l_2$ is set to $1$ when there are no $l_2$-way orders). We also prove that under some combinations, including ranked top-$3$, ranked top-$2$ plus $2$-way, and choice data over up to $4$ alternatives, mixtures of two Plackett-Luce models are identifiable. Guided by our theoretical results, we propose efficient generalized method of moments (GMM) algorithms to learn mixtures of two Plackett-Luce models, which are proven consistent. Our experiments demonstrate the efficacy of our algorithms. Moreover, we show that when full rankings are available, learning from different marginal events (partial orders) provides tradeoffs between statistical efficiency and computational efficiency.

AI · Jun 13, 2019

Multi-type Resource Allocation with Partial Preferences

Haibin Wang, Sujoy Sikdar, Xiaoxi Guo et al.

We propose multi-type probabilistic serial (MPS) and multi-type random priority (MRP) as extensions of the well known PS and RP mechanisms to the multi-type resource allocation problem (MTRA) with partial preferences. In our setting, there are multiple types of divisible items, and a group of agents who have partial order preferences over bundles consisting of one item of each type. We show that for the unrestricted domain of partial order preferences, no mechanism satisfies both sd-efficiency and sd-envy-freeness. Notwithstanding this impossibility result, our main message is positive: When agents' preferences are represented by acyclic CP-nets, MPS satisfies sd-efficiency, sd-envy-freeness, ordinal fairness, and upper invariance, while MRP satisfies ex-post-efficiency, sd-strategy-proofness, and upper invariance, recovering the properties of PS and RP.

HC · May 27, 2019

Minimizing Time-to-Rank: A Learning and Recommendation Approach

Haoming Li, Sujoy Sikdar, Rohit Vaish et al.

Consider the following problem faced by an online voting platform: A user is provided with a list of alternatives, and is asked to rank them in order of preference using only drag-and-drop operations. The platform's goal is to recommend an initial ranking that minimizes the time spent by the user in arriving at her desired ranking. We develop the first optimization framework to address this problem, and make theoretical as well as practical contributions. On the practical side, our experiments on Amazon Mechanical Turk provide two interesting insights about user behavior: First, that users' ranking strategies closely resemble selection or insertion sort, and second, that the time taken for a drag-and-drop operation depends linearly on the number of positions moved. These insights directly motivate our theoretical model of the optimization problem. We show that computing an optimal recommendation is NP-hard, and provide exact and approximation algorithms for a variety of special cases of the problem. Experimental evaluation on MTurk shows that, compared to a random recommendation strategy, the proposed approach reduces the (average) time-to-rank by up to 50%.

AI · May 2, 2019

Frustratingly Easy Truth Discovery

Reshef Meir, Ofra Amir, Omer Ben-Porat et al.

Truth discovery is a general name for a broad range of statistical methods aimed to extract the correct answers to questions, based on multiple answers coming from noisy sources, for example, workers in a crowdsourcing platform. In this paper, we consider an extremely simple heuristic for estimating workers' competence using average proximity to other workers. We prove that this estimates well the actual competence level and enables separating high and low quality workers in a wide spectrum of domains and statistical models. Under Gaussian noise, this simple estimate is the unique solution to the MLE with a constant regularization factor. Finally, weighting workers according to their average proximity in a crowdsourcing setting results in substantial improvement over unweighted aggregation and other truth discovery algorithms in practice.

CR · Apr 15, 2019

Differential Privacy for Eye-Tracking Data

Ao Liu, Lirong Xia, Andrew Duchowski et al.

As large eye-tracking datasets are created, data privacy is a pressing concern for the eye-tracking community. De-identifying data does not guarantee privacy because multiple datasets can be linked for inferences. A common belief is that aggregating individuals' data into composite representations such as heatmaps protects the individual. However, we analytically examine the privacy of (noise-free) heatmaps and show that they do not guarantee privacy. We further propose two noise mechanisms that guarantee privacy and analyze their privacy-utility tradeoff. Analysis reveals that our Gaussian noise mechanism is an elegant solution to preserve privacy for heatmaps. Our results have implications for interdisciplinary research to create differentially private mechanisms for eye tracking.

AI · Jan 16, 2019

Practical Algorithms for Multi-Stage Voting Rules with Parallel Universes Tiebreaking

Jun Wang, Sujoy Sikdar, Tyler Shepherd et al.

STV and ranked pairs (RP) are two well-studied voting rules for group decision-making. They proceed in multiple rounds, and are affected by how ties are broken in each round. However, the literature is surprisingly vague about how ties should be broken. We propose the first algorithms for computing the set of alternatives that are winners under some tiebreaking mechanism under STV and RP, which is also known as parallel-universes tiebreaking (PUT). Unfortunately, PUT-winners are NP-complete to compute under STV and RP, and standard search algorithms from AI do not apply. We propose multiple DFS-based algorithms along with pruning strategies, heuristics, sampling and machine learning to prioritize search direction to significantly improve the performance. We also propose novel ILP formulations for PUT-winners under STV and RP, respectively. Experiments on synthetic and real-world data show that our algorithms are overall faster than ILP.
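To make the notion of a PUT-winner concrete, the following is a minimal sketch of a depth-first search over tie-breaking choices for a simplified single-winner STV, which eliminates one lowest-scoring alternative per round until one remains. It is an assumption-laden illustration rather than the paper's algorithm: it uses only memoization on the set of remaining alternatives and none of the pruning, heuristics, sampling, or learned prioritization the abstract mentions. The names `plurality_scores` and `put_stv_winners` are hypothetical.

```python
def plurality_scores(profile, remaining):
    """Count, for each remaining alternative, the voters who rank it first among `remaining`."""
    scores = {x: 0 for x in remaining}
    for ranking in profile:
        top = next(x for x in ranking if x in remaining)
        scores[top] += 1
    return scores


def put_stv_winners(profile):
    """Return all alternatives that win under some way of breaking elimination ties."""
    winners = set()
    seen = set()  # sets of remaining alternatives that were already expanded
    stack = [frozenset(profile[0])]
    while stack:
        remaining = stack.pop()
        if remaining in seen:
            continue
        seen.add(remaining)
        if len(remaining) == 1:
            winners |= remaining
            continue
        scores = plurality_scores(profile, remaining)
        lowest = min(scores.values())
        # Branch on every alternative tied for the lowest score ("parallel universes").
        for x in remaining:
            if scores[x] == lowest:
                stack.append(remaining - {x})
    return winners


tied = [("a", "b"), ("b", "a")]
clear = [("a", "b", "c"), ("b", "a", "c"), ("c", "a", "b"), ("c", "b", "a"), ("c", "a", "b")]
print(sorted(put_stv_winners(tied)), sorted(put_stv_winners(clear)))
```

Memoizing on the remaining set is sound here because later rounds depend only on which alternatives are left, not on the order in which the others were eliminated. The number of reachable sets can still grow exponentially, consistent with the NP-completeness noted above.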

LGJul 9, 2018

Towards Non-Parametric Learning to Rank

Ao Liu, Qiong Wu, Zhenming Liu et al.

This paper studies a stylized, yet natural, learning-to-rank problem and points out the critical incorrectness of a widely used nearest neighbor algorithm. We consider a model with $n$ agents (users) $\{x_i\}_{i \in [n]}$ and $m$ alternatives (items) $\{y_j\}_{j \in [m]}$, each of which is associated with a latent feature vector. Agents rank items nondeterministically according to the Plackett-Luce model, where the higher the utility of an item to the agent, the more likely this item will be ranked high by the agent. Our goal is to find neighbors of an arbitrary agent or alternative in the latent space. We first show that the Kendall-tau distance based kNN produces incorrect results in our model. Next, we fix the problem by introducing a new algorithm with features constructed from "global information" of the data matrix. Our approach is in sharp contrast to most existing feature engineering methods. Finally, we design another new algorithm identifying similar alternatives. The construction of alternative features can be done using "local information," highlighting the algorithmic difference between finding similar agents and similar alternatives.
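
The two ingredients named above can be sketched in Python as follows: sampling a ranking from the Plackett-Luce model, and the Kendall-tau distance that a naive kNN would use to compare agents. The function names and the dictionary of positive utilities are assumptions for illustration; the paper's corrected algorithms are not reproduced here.

```python
import random
from itertools import combinations


def sample_plackett_luce(utilities, rng):
    """Draw one ranking: repeatedly pick the next item with probability
    proportional to its (positive) utility among the items still unranked."""
    remaining = list(utilities)
    ranking = []
    while remaining:
        weights = [utilities[x] for x in remaining]
        pick = rng.choices(remaining, weights=weights, k=1)[0]
        ranking.append(pick)
        remaining.remove(pick)
    return ranking


def kendall_tau(r1, r2):
    """Number of item pairs that r1 and r2 order differently."""
    pos = {x: i for i, x in enumerate(r2)}
    return sum(1 for x, y in combinations(r1, 2) if pos[x] > pos[y])
```

For example, `kendall_tau(["a", "b", "c"], ["c", "b", "a"])` counts all three pairs as discordant.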

LGJun 4, 2018

Composite Marginal Likelihood Methods for Random Utility Models

Zhibing Zhao, Lirong Xia

We propose a novel and flexible rank-breaking-then-composite-marginal-likelihood (RBCML) framework for learning random utility models (RUMs), which include the Plackett-Luce model. We characterize conditions for the objective function of RBCML to be strictly log-concave by proving that strict log-concavity is preserved under convolution and marginalization. We characterize necessary and sufficient conditions for RBCML to satisfy consistency and asymptotic normality. Experiments on synthetic data show that RBCML for Gaussian RUMs achieves better statistical efficiency and computational efficiency than the state-of-the-art algorithm and our RBCML for the Plackett-Luce model provides flexible tradeoffs between running time and statistical efficiency.

AIMay 17, 2018

Practical Algorithms for STV and Ranked Pairs with Parallel Universes Tiebreaking

Jun Wang, Sujoy Sikdar, Tyler Shepherd et al.

STV and ranked pairs (RP) are two well-studied voting rules for group decision-making. They proceed in multiple rounds, and are affected by how ties are broken in each round. However, the literature is surprisingly vague about how ties should be broken. We propose the first algorithms for computing the set of alternatives that are winners under some tiebreaking mechanism under STV and RP, which is also known as parallel-universes tiebreaking (PUT). Unfortunately, PUT-winners are NP-complete to compute under STV and RP, and standard search algorithms from AI do not apply. We propose multiple DFS-based algorithms, along with pruning strategies and heuristics that use machine learning to prioritize the search direction, to significantly improve performance. We also propose novel ILP formulations for PUT-winners under STV and RP, respectively. Experiments on synthetic and real-world data show that our algorithms are overall significantly faster than ILP, while there are a few cases where ILP is significantly faster for RP.

CRMay 15, 2018

How Private Are Commonly-Used Voting Rules?

Ao Liu, Yun Lu, Lirong Xia et al.

Differential privacy has been widely applied to provide privacy guarantees by adding random noise to the function output. However, it inevitably fails in many high-stakes voting scenarios, where voting rules are required to be deterministic. In this work, we present the first framework for answering the question: "How private are commonly-used voting rules?" Our answers are two-fold. First, we show that deterministic voting rules provide sufficient privacy in the sense of distributional differential privacy (DDP). We show that assuming the adversarial observer has uncertainty about individual votes, even publishing the histogram of votes achieves good DDP. Second, we introduce the notion of exact privacy to compare the privacy preserved in various commonly-studied voting rules, and obtain dichotomy theorems of exact DDP within a large subset of voting rules called generalized scoring rules.

LGMay 14, 2018

A Cost-Effective Framework for Preference Elicitation and Aggregation

Zhibing Zhao, Haoming Li, Junming Wang et al.

We propose a cost-effective framework for preference elicitation and aggregation under the Plackett-Luce model with features. Given a budget, our framework iteratively computes the most cost-effective elicitation questions in order to help the agents make a better group decision. We illustrate the viability of the framework with experiments on Amazon Mechanical Turk, which we use to estimate the cost of answering different types of elicitation questions. We compare the prediction accuracy of our framework when adopting various information criteria that evaluate the expected information gain from a question. Our experiments show carefully designed information criteria are much more efficient, i.e., they arrive at the correct answer using fewer queries, than randomly asking questions given the budget constraint.

LGMar 23, 2016

Learning Mixtures of Plackett-Luce Models

Zhibing Zhao, Peter Piech, Lirong Xia

In this paper we address the identifiability and efficient learning problems of finite mixtures of Plackett-Luce models for rank data. We prove that for any $k\geq 2$, the mixture of $k$ Plackett-Luce models for no more than $2k-1$ alternatives is non-identifiable and this bound is tight for $k=2$. For generic identifiability, we prove that the mixture of $k$ Plackett-Luce models over $m$ alternatives is generically identifiable if $k\leq\lfloor\frac {m-2} 2\rfloor!$. We also propose an efficient generalized method of moments (GMM) algorithm to learn the mixture of two Plackett-Luce models and show that the algorithm is consistent. Our experiments show that our GMM algorithm is significantly faster than the EMM algorithm by Gormley and Murphy (2008), while achieving competitive statistical efficiency.
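
For readers unfamiliar with the model, the following Python sketch evaluates the probability of a full ranking under a single Plackett-Luce model and under a finite mixture. The function names and the dictionary-of-weights parameterization are illustrative assumptions; the GMM estimator from the paper is not shown.

```python
def pl_probability(ranking, weights):
    """Plackett-Luce probability of a full ranking.

    At each position the ranked item is chosen with probability equal to its
    weight divided by the total weight of the items not yet ranked.
    """
    prob = 1.0
    for i, item in enumerate(ranking):
        tail = sum(weights[x] for x in ranking[i:])
        prob *= weights[item] / tail
    return prob


def mixture_probability(ranking, alphas, components):
    """Probability under a mixture: sum_k alphas[k] * PL(ranking; components[k])."""
    return sum(a * pl_probability(ranking, w) for a, w in zip(alphas, components))
```

Non-identifiability, in these terms, means that two different choices of `alphas` and `components` can yield the same `mixture_probability` for every ranking.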

AINov 26, 2015

Welfare of Sequential Allocation Mechanisms for Indivisible Goods

Haris Aziz, Thomas Kalinowski, Toby Walsh et al.

Sequential allocation is a simple and attractive mechanism for the allocation of indivisible goods. Agents take turns, according to a policy, to pick items. Sequential allocation is guaranteed to return an allocation which is efficient but may not have an optimal social welfare. We consider therefore the relation between welfare and efficiency. We study the (computational) questions of what welfare is possible or necessary depending on the choice of policy. We also consider a novel control problem in which the chair chooses a policy to improve social welfare.
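
The mechanism described above can be sketched in a few lines of Python. The sketch assumes agents pick truthfully, that is, each agent takes its most preferred item still available; the function name `sequential_allocation` and the data layout are illustrative assumptions.

```python
def sequential_allocation(policy, preferences):
    """Run sequential allocation with truthful picking.

    policy: sequence of agent ids giving the order of turns.
    preferences: dict mapping each agent to its ranking of items, best first.
    Returns a dict mapping each agent to the list of items it picked.
    """
    taken = set()
    bundles = {agent: [] for agent in preferences}
    for agent in policy:
        for item in preferences[agent]:
            if item not in taken:  # best item still available
                taken.add(item)
                bundles[agent].append(item)
                break
    return bundles
```

With policy `[1, 2, 2, 1]` and preferences `{1: ["a", "b", "c", "d"], 2: ["a", "c", "b", "d"]}`, agent 1 ends up with `a` and `d` and agent 2 with `c` and `b`. Changing the policy changes the outcome, which is the lever the control problem in the abstract refers to.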

GTApr 22, 2015

Allocating Indivisible Items in Categorized Domains

Erika Mackin, Lirong Xia

We formulate a general class of allocation problems called categorized domain allocation problems (CDAPs), where indivisible items from multiple categories are allocated to agents without monetary transfer and each agent gets at least one item per category. We focus on basic CDAPs, where the number of items in each category is equal to the number of agents. We characterize serial dictatorships for basic CDAPs by a minimal set of three axiomatic properties: strategy-proofness, non-bossiness, and category-wise neutrality. Then, we propose a natural extension of serial dictatorships called categorial sequential allocation mechanisms (CSAMs), which allocate the items in multiple rounds: in each round, the active agent chooses an item from a designated category. We fully characterize the worst-case rank efficiency of CSAMs for optimistic and pessimistic agents, and provide a bound for strategic agents. We also conduct experiments to compare expected rank efficiency of various CSAMs w.r.t. random generated data.

AIDec 6, 2014

Possible and Necessary Allocations via Sequential Mechanisms

Haris Aziz, Toby Walsh, Lirong Xia

A simple mechanism for allocating indivisible resources is sequential allocation in which agents take turns to pick items. We focus on possible and necessary allocation problems, checking whether allocations of a given form occur in some or all mechanisms for several commonly used classes of sequential allocation mechanisms. In particular, we consider whether a given agent receives a given item, a set of items, or a subset of items for five natural classes of sequential allocation mechanisms: balanced, recursively balanced, balanced alternating, strictly alternating and all policies. We identify characterizations of allocations produced by balanced, recursively balanced, balanced alternating, and strictly alternating policies, respectively, which extend the well-known characterization by Brams and King [2005] for policies without restrictions. In addition, we examine the computational complexity of possible and necessary allocation problems for these classes.

AIOct 29, 2014

A Statistical Decision-Theoretic Framework for Social Choice

Hossein Azari Soufiani, David C. Parkes, Lirong Xia

In this paper, we take a statistical decision-theoretic viewpoint on social choice, putting a focus on the decision to be made on behalf of a system of agents. In our framework, we are given a statistical ranking model, a decision space, and a loss function defined on (parameter, decision) pairs, and formulate social choice mechanisms as decision rules that minimize expected loss. This suggests a general framework for the design and analysis of new social choice mechanisms. We compare Bayesian estimators, which minimize Bayesian expected loss, for the Mallows model and the Condorcet model respectively, and the Kemeny rule. We consider various normative properties, in addition to computational complexity and asymptotic behavior. In particular, we show that the Bayesian estimator for the Condorcet model satisfies some desired properties such as anonymity, neutrality, and monotonicity, can be computed in polynomial time, and is asymptotically different from the other two rules when the data are generated from the Condorcet model for some ground truth parameter.
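
Of the three rules compared above, the Kemeny rule is the easiest to state in code: it returns the ranking that minimizes the total Kendall-tau distance to the votes. The brute-force Python sketch below enumerates all rankings, so it is only practical for a handful of alternatives. The function names are illustrative assumptions, and the Bayesian estimators studied in the paper are not reproduced.

```python
from itertools import combinations, permutations


def kendall_tau(r1, r2):
    """Number of item pairs that r1 and r2 order differently."""
    pos = {x: i for i, x in enumerate(r2)}
    return sum(1 for x, y in combinations(r1, 2) if pos[x] > pos[y])


def kemeny_rankings(profile):
    """Return every ranking minimizing the summed Kendall-tau distance
    to the votes in the profile (ties are all reported)."""
    candidates = sorted(profile[0])
    best, best_cost = [], None
    for ranking in permutations(candidates):
        cost = sum(kendall_tau(ranking, vote) for vote in profile)
        if best_cost is None or cost < best_cost:
            best, best_cost = [ranking], cost
        elif cost == best_cost:
            best.append(ranking)
    return best
```

In the decision-theoretic reading, the summed distance plays the role of a loss to be minimized over the decision space of rankings.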

AISep 26, 2013

Preference Elicitation For General Random Utility Models

Hossein Azari Soufiani, David C. Parkes, Lirong Xia

This paper discusses General Random Utility Models (GRUMs). These are a class of parametric models that generate partial ranks over alternatives given attributes of agents and alternatives. We propose two preference elicitation schemes for GRUMs developed from principles in Bayesian experimental design, one for social choice and the other for personalized choice. We couple this with a general Monte-Carlo-Expectation-Maximization (MC-EM) based algorithm for MAP inference under GRUMs. We also prove uni-modality of the likelihood functions for a class of GRUMs. We examine the performance of various criteria by experimental studies, which show that the proposed elicitation schemes increase the precision of estimation.

AIApr 5, 2012

How Many Vote Operations Are Needed to Manipulate A Voting System?

Lirong Xia

In this paper, we propose a framework to study a general class of strategic behavior in voting, which we call vote operations. We prove the following theorem: if we fix the number of alternatives, generate $n$ votes i.i.d. according to a distribution $π$, and let $n$ go to infinity, then for any $ε>0$, with probability at least $1-ε$, the minimum number of operations that are needed for the strategic individual to achieve her goal falls into one of the following four categories: (1) 0, (2) $Θ(\sqrt n)$, (3) $Θ(n)$, and (4) $\infty$. This theorem holds for any set of vote operations, any individual vote distribution $π$, and any integer generalized scoring rule, which includes (but is not limited to) almost all commonly studied voting rules, e.g., approval voting, all positional scoring rules (including Borda, plurality, and veto), plurality with runoff, Bucklin, Copeland, maximin, STV, and ranked pairs. We also show that many well-studied types of strategic behavior fall under our framework, including (but not limited to) constructive/destructive manipulation, bribery, and control by adding/deleting votes, margin of victory, and minimum manipulation coalition size. Therefore, our main theorem naturally applies to these problems.

AIMar 14, 2012

Combining Voting Rules Together

Nina Narodytska, Toby Walsh, Lirong Xia

We propose a simple method for combining together voting rules that performs a run-off between the different winners of each voting rule. We prove that this combinator has several good properties. For instance, even if just one of the base voting rules has a desirable property like Condorcet consistency, the combination inherits this property. In addition, we prove that combining voting rules together in this way can make finding a manipulation more computationally difficult. Finally, we study the impact of this combinator on approximation methods that find close to optimal manipulations.
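
A minimal Python sketch of such a combinator for two base rules follows: each rule names a winner, and if the winners differ, a pairwise majority run-off decides between them. The choice of plurality and Borda as base rules, the lexicographic tie-breaking, and all function names are assumptions made for illustration rather than details taken from the paper.

```python
def plurality_winner(profile):
    """Alternative with the most first-place votes (ties: lexicographic)."""
    scores = {}
    for ranking in profile:
        scores[ranking[0]] = scores.get(ranking[0], 0) + 1
    return min(scores, key=lambda c: (-scores[c], c))


def borda_winner(profile):
    """Alternative with the highest Borda score (ties: lexicographic)."""
    m = len(profile[0])
    scores = {c: 0 for c in profile[0]}
    for ranking in profile:
        for pos, c in enumerate(ranking):
            scores[c] += m - 1 - pos
    return min(scores, key=lambda c: (-scores[c], c))


def combine(rule1, rule2, profile):
    """Run both rules, then hold a majority run-off if their winners differ."""
    w1, w2 = rule1(profile), rule2(profile)
    if w1 == w2:
        return w1
    prefer_w1 = sum(1 for r in profile if r.index(w1) < r.index(w2))
    prefer_w2 = len(profile) - prefer_w1
    if prefer_w1 == prefer_w2:
        return min(w1, w2)  # assumed lexicographic tie-break
    return w1 if prefer_w1 > prefer_w2 else w2
```

If a Condorcet winner exists and one base rule selects it, it wins any run-off it enters, which gives some intuition for the inheritance property mentioned above.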

GTFeb 14, 2012

Price Updating in Combinatorial Prediction Markets with Bayesian Networks

David M. Pennock, Lirong Xia

To overcome the #P-hardness of computing/updating prices in logarithmic market scoring rule-based (LMSR-based) combinatorial prediction markets, Chen et al. [5] recently used a simple Bayesian network to represent the prices of securities in combinatorial prediction markets for tournaments, and showed that two types of popular securities are structure preserving. In this paper, we significantly extend this idea by employing Bayesian networks in general combinatorial prediction markets. We reveal a very natural connection between LMSR-based combinatorial prediction markets and probabilistic belief aggregation, which leads to a complete characterization of all structure preserving securities for decomposable network structures. Notably, the main results by Chen et al. [5] are corollaries of our characterization. We then prove that in order for a very basic set of securities to be structure preserving, the graph of the Bayesian network must be decomposable. We also discuss some approximation techniques for securities that are not structure preserving.
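
For context, the standard LMSR cost and price functions over an explicitly enumerated outcome space can be sketched in Python as below. The function names and the liquidity parameter name `b` are illustrative. Enumerating outcomes this way is exactly what becomes intractable in combinatorial markets, where the number of outcomes is exponential, and the Bayesian-network representation from the paper is not shown.

```python
import math


def lmsr_cost(q, b):
    """LMSR cost function C(q) = b * log(sum_i exp(q_i / b)).

    q: outstanding shares per outcome; b: liquidity parameter (b > 0).
    The maximum is factored out for numerical stability.
    """
    m = max(x / b for x in q)
    return b * (m + math.log(sum(math.exp(x / b - m) for x in q)))


def lmsr_prices(q, b):
    """Instantaneous prices p_i = exp(q_i / b) / sum_j exp(q_j / b)."""
    m = max(x / b for x in q)
    exps = [math.exp(x / b - m) for x in q]
    total = sum(exps)
    return [e / total for e in exps]


def trade_cost(q_old, q_new, b):
    """Amount a trader pays to move the market from q_old to q_new."""
    return lmsr_cost(q_new, b) - lmsr_cost(q_old, b)
```

The prices always sum to one, so they can be read as a probability distribution over outcomes.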