Nicholas Mattei

AI
h-index8
49papers
2,581citations
Novelty42%
AI Score55

49 Papers

29.7CLJun 2
Using Text-Based Causal Inference to Disentangle Factors Influencing Online Review Ratings

Linsen Li, Aron Culotta, Nicholas Mattei

Online reviews provide valuable insights into the perceived quality of facets of a product or service. While aspect-based sentiment analysis has focused on extracting these facets from reviews, there is less work understanding the impact of each aspect on overall perception. This is particularly challenging given correlations among aspects, making it difficult to isolate the effects of each. This paper introduces a methodology based on recent advances in text-based causal analysis, specifically CausalBERT, to disentangle the effect of each factor on overall review ratings. We enhance CausalBERT with three key improvements: temperature scaling for better calibrated treatment assignment estimates; hyperparameter optimization to reduce confound overadjustment; and interpretability methods to characterize discovered confounds. In this work, we treat the textual mentions in reviews as proxies for real-world attributes. We validate our approach on real and semi-synthetic data from over 600K reviews of U.S. K-12 schools. We find that the proposed enhancements result in more reliable estimates, and that perception of school administration and performance on benchmarks are significant drivers of overall school ratings.

AIMar 2, 2023
Dynamic fairness-aware recommendation through multi-agent social choice

Amanda Aird, Paresha Farastu, Joshua Sun et al.

Algorithmic fairness in the context of personalized recommendation presents significantly different challenges to those commonly encountered in classification tasks. Researchers studying classification have generally considered fairness to be a matter of achieving equality of outcomes between a protected and unprotected group, and built algorithmic interventions on this basis. We argue that fairness in real-world application settings in general, and especially in the context of personalized recommendation, is much more complex and multi-faceted, requiring a more general approach. We propose a model to formalize multistakeholder fairness in recommender systems as a two stage social choice problem. In particular, we express recommendation fairness as a novel combination of an allocation and an aggregation problem, which integrate both fairness concerns and personalized recommendation provisions, and derive new recommendation techniques based on this formulation. Simulations demonstrate the ability of the framework to integrate multiple fairness concerns in a dynamic way.

IRSep 8, 2022
Who Pays? Personalization, Bossiness and the Cost of Fairness

Paresha Farastu, Nicholas Mattei, Robin Burke

Fairness-aware recommender systems that have a provider-side fairness concern seek to ensure that protected group(s) of providers have a fair opportunity to promote their items or products. There is a ``cost of fairness'' borne by the consumer side of the interaction when such a solution is implemented. This consumer-side cost raises its own questions of fairness, particularly when personalization is used to control the impact of the fairness constraint. In adopting a personalized approach to the fairness objective, researchers may be opening their systems up to strategic behavior on the part of users. This type of incentive has been studied in the computational social choice literature under the terminology of ``bossiness''. The concern is that a bossy user may be able to shift the cost of fairness to others, improving their own outcomes and worsening those for others. This position paper introduces the concept of bossiness, shows its application in fairness-aware recommendation and discusses strategies for reducing this strategic incentive.

LGNov 15, 2022
Who Reviews The Reviewers? A Multi-Level Jury Problem

Ben Abramowitz, Omer Lev, Nicholas Mattei

We consider the problem of determining a binary ground truth using advice from a group of independent reviewers (experts) who express their guess about a ground truth correctly with some independent probability (competence). In this setting, when all reviewers are competent (competence greater than one-half), the Condorcet Jury Theorem tells us that adding more reviewers increases the overall accuracy, and if all competences are known, then there exists an optimal weighting of the reviewers. However, in practical settings, reviewers may be noisy or incompetent, i.e., competence below half, and the number of experts may be small, so the asymptotic Condorcet Jury Theorem is not practically relevant. In such cases we explore appointing one or more chairs (judges) who determine the weight of each reviewer for aggregation, creating multiple levels. However, these chairs may be unable to correctly identify the competence of the reviewers they oversee, and therefore unable to compute the optimal weighting. We give conditions when a set of chairs is able to weight the reviewers optimally, and depending on the competence distribution of the agents, give results about when it is better to have more chairs or more reviewers. Through numerical simulations we show that in some cases it is better to have more chairs, but in many cases it is better to have more reviewers.

MANov 18, 2022
Pandering in a Flexible Representative Democracy

Xiaolin Sun, Jacob Masur, Ben Abramowitz et al.

In representative democracies, the election of new representatives in regular election cycles is meant to prevent corruption and other misbehavior by elected officials and to keep them accountable in service of the ``will of the people." This democratic ideal can be undermined when candidates are dishonest when campaigning for election over these multiple cycles or rounds of voting. Much of the work on COMSOC to date has investigated strategic actions in only a single round. We introduce a novel formal model of \emph{pandering}, or strategic preference reporting by candidates seeking to be elected, and examine the resilience of two democratic voting systems to pandering within a single round and across multiple rounds. The two voting systems we compare are Representative Democracy (RD) and Flexible Representative Democracy (FRD). For each voting system, our analysis centers on the types of strategies candidates employ and how voters update their views of candidates based on how the candidates have pandered in the past. We provide theoretical results on the complexity of pandering in our setting for a single cycle, formulate our problem for multiple cycles as a Markov Decision Process, and use reinforcement learning to study the effects of pandering by both single candidates and groups of candidates across a number of rounds.

AIJul 14, 2023
Value-based Fast and Slow AI Nudging

Marianna B. Ganapini, Francesco Fabiano, Lior Horesh et al.

Nudging is a behavioral strategy aimed at influencing people's thoughts and actions. Nudging techniques can be found in many situations in our daily lives, and these nudging techniques can targeted at human fast and unconscious thinking, e.g., by using images to generate fear or the more careful and effortful slow thinking, e.g., by releasing information that makes us reflect on our choices. In this paper, we propose and discuss a value-based AI-human collaborative framework where AI systems nudge humans by proposing decision recommendations. Three different nudging modalities, based on when recommendations are presented to the human, are intended to stimulate human fast thinking, slow thinking, or meta-cognition. Values that are relevant to a specific decision scenario are used to decide when and how to use each of these nudging modalities. Examples of values are decision quality, speed, human upskilling and learning, human agency, and privacy. Several values can be present at the same time, and their priorities can vary over time. The framework treats values as parameters to be instantiated in a specific decision environment.

LGJun 3, 2022
Towards Group Learning: Distributed Weighting of Experts

Ben Abramowitz, Nicholas Mattei

Aggregating signals from a collection of noisy sources is a fundamental problem in many domains including crowd-sourcing, multi-agent planning, sensor networks, signal processing, voting, ensemble learning, and federated learning. The core question is how to aggregate signals from multiple sources (e.g. experts) in order to reveal an underlying ground truth. While a full answer depends on the type of signal, correlation of signals, and desired output, a problem common to all of these applications is that of differentiating sources based on their quality and weighting them accordingly. It is often assumed that this differentiation and aggregation is done by a single, accurate central mechanism or agent (e.g. judge). We complicate this model in two ways. First, we investigate the setting with both a single judge, and one with multiple judges. Second, given this multi-agent interaction of judges, we investigate various constraints on the judges' reporting space. We build on known results for the optimal weighting of experts and prove that an ensemble of sub-optimal mechanisms can perform optimally under certain conditions. We then show empirically that the ensemble approximates the performance of the optimal mechanism under a broader range of conditions.

28.3GTMay 22
Analyzing the Effects of Two-Stage Peer Evaluation

Roy Fairstein, Harper Lyon, Oshri Damty et al.

Peer-evaluation and selection systems are used when sets of agents evaluate each other in order to select the best $k$ among them. These are commonly used in real-world settings, including academic conferences where those reviewing papers are often the set of submitters. Conferences have attempted to better allocate their reviewing resources by moving to a two-stage mechanism, in which some papers are eliminated after a first stage of review and remaining papers receive additional reviewers. We investigate how two major strategyproof peer selection mechanisms, Partition and ExactDollarPartition, perform when adapted to a two-stage system, in order to try and understand the effect of the two-stage mechanism on which agents get selected. We also examine how the various parameters of the two-stage mechanism influence the outcome. We provide a theoretical basis by showing how a particular setting is influenced by the two stages. However, solving for the general case seems implausible at the moment, and we use extensive simulations of different scenarios and settings to observe which agents benefit and which are harmed by adopting two-stage mechanisms (and we vary this mechanisms parameters as well). We show that the two-stage mechanism's advantage depends the noisiness of reviewer beliefs. Borderline agents benefit most in a low noise environment, while high rank agents benefit more in noisy environments. We show that the effectiveness of these mechanisms is highly dependent on the number of chosen agents, the number of reviews requested from agents, and reviewers' correlation, indicating that organizers need to exercise caution when selecting these parameters for a reviewing process.

41.6GTMay 22
PeerBTS: Incentivizing Effort in Strategyproof Peer Selection

Harper Lyon, Omer Lev, Nicholas Mattei

Peer selection, the evaluation and selection of agents by their peers, is an important problem in the field of computational social choice; with applications to grading in massively online courses (MOOCs) and academic peer review. Current existing algorithmic and empirical work focuses on developing and analyzing novel \emph{strategyproof} mechanisms, wherein no agent has an incentive to misreport their evaluations. However, the majority of published mechanisms share a flaw: they do not \emph{reward} agents for any effort expended during the evaluation process. In cases where high quality evaluations are costly to produce this missing incentive fails to align agents with an overall goal of accurate selection. To address this gap we first prove theoretically that incentivizing effort in peer selection requires information beyond a single evaluation. We then propose \textsc{PeerBTS}, a mechanism that combines a peer-prediction lottery, leveraging work on the Robust Bayesian Truth Serum, with any existing peer-selection mechanism to incentivize effort while remaining Bayes-Nash incentive compatible. We find that while an incentive-compatible peer-selection mechanism using agent predictions to incentivize effort is possible it requires adjustments to the assumed problem context and limits other mechanistics properties. We additionally present a series of non-strategic simulations to validate incentives and evaluate the performance of PeerBTS relative to existing strategyproof peer selection mechanisms. Finally, we discuss the results of an initial study on the validity of peer-prediction from a small academic workshop.

MAAug 24, 2024
DeepVoting: Learning and Fine-Tuning Voting Rules with Canonical Embeddings

Leonardo Matone, Ben Abramowitz, Ben Armstrong et al.

Aggregating agent preferences into a collective decision is an important step in many problems (e.g., hiring, elections, peer review) and across areas of computer science (e.g., reinforcement learning, recommender systems). As Social Choice Theory has shown, the problem of designing aggregation rules with specific sets of properties (axioms) can be difficult, or provably impossible in some cases. Instead of designing algorithms by hand, one can learn aggregation rules, particularly voting rules, from data. However, prior work in this area has required extremely large models or been limited by the choice of preference representation, i.e., embedding. We recast the problem of designing voting rules with desirable properties into one of learning probabilistic functions that output distributions over a set of candidates. Specifically, we use neural networks to learn probabilistic social choice functions. Using standard embeddings from the social choice literature we show that preference profile encoding has significant impact on the efficiency and ability of neural networks to learn rules, allowing us to learn rules faster and with smaller networks than previous work. Moreover, we show that our learned rules can be fine-tuned using axiomatic properties to create novel voting rules and make them resistant to specific types of "attack". Namely, we fine-tune rules to resist a probabilistic version of the No Show Paradox.

GTNov 15, 2022
Social Mechanism Design: Making Maximally Acceptable Decisions

Ben Abramowitz, Nicholas Mattei

Agents care not only about the outcomes of collective decisions but also about how decisions are made. In many cases, both the outcome and the procedure affect whether agents see a decision as legitimate, justifiable, or acceptable. We propose a novel model for collective decisions that takes into account both the preferences of the agents and their higher order concerns about the process of preference aggregation. To this end we (1) propose natural, plausible preference structures and establish key properties thereof, (2) develop mechanisms for aggregating these preferences to maximize the acceptability of decisions, and (3) characterize the performance of our acceptance-maximizing mechanisms. We apply our general approach to the specific setting of dichotomous choice, and compare the worst-case rates of acceptance achievable among populations of agents of different types. We also show in the special case of rule selection, i.e., amendment procedures, the method proposed by Abramowitz, Shapiro, and Talmon (2021) achieves universal acceptance with certain agent types.

29.4HCMay 12
Co-Designing Organizational Justice Indicators for Algorithmic Systems

Fujiko Robledo Yamamoto, Nicholas Mattei, Pradeep Ragothaman et al.

Fairness in machine learning is often conceptualized narrowly in comparative, distributional terms. In studying stakeholders' concepts of fairness, we find that this framing is insufficient to capture the full range of issues raised. As an alternative, we propose organizational justice as a framework that subsumes distributional fairness as well as other normative concerns. We conduct a case study of organizational justice relative to personalized recommendation in the context of Kiva Microfunds, a nonprofit micro-lending organization whose mission is to increase financial access for underserved communities across the world. We report on the results of co-design workshops conducted with Kiva employees who are involved in different departments and whose roles often lead them to prioritize normative concerns that are most supportive of the stakeholders with whom they work most closely. We apply organizational justice to understand design trade-offs among different normative goals stakeholders invoke. Based on these goals, we identify a suite of metrics that Kiva employees can use to monitor and assess the recommender system's impact on their organizational justice concerns and to seed discussions within the organization about appropriate configuration and deployment of this system in context.

IRSep 10, 2023
Exploring Social Choice Mechanisms for Recommendation Fairness in SCRUF

Amanda Aird, Cassidy All, Paresha Farastu et al.

Fairness problems in recommender systems often have a complexity in practice that is not adequately captured in simplified research formulations. A social choice formulation of the fairness problem, operating within a multi-agent architecture of fairness concerns, offers a flexible and multi-aspect alternative to fairness-aware recommendation approaches. Leveraging social choice allows for increased generality and the possibility of tapping into well-studied social choice algorithms for resolving the tension between multiple, competing fairness concerns. This paper explores a range of options for choice mechanisms in multi-aspect fairness applications using both real and synthetic data and shows that different classes of choice and allocation mechanisms yield different but consistent fairness / accuracy tradeoffs. We also show that a multi-agent formulation offers flexibility in adapting to user population dynamics.

CLOct 14, 2025
Investigating Political and Demographic Associations in Large Language Models Through Moral Foundations Theory

Nicole Smith-Vaniz, Harper Lyon, Lorraine Steigner et al.

Large Language Models (LLMs) have become increasingly incorporated into everyday life for many internet users, taking on significant roles as advice givers in the domains of medicine, personal relationships, and even legal matters. The importance of these roles raise questions about how and what responses LLMs make in difficult political and moral domains, especially questions about possible biases. To quantify the nature of potential biases in LLMs, various works have applied Moral Foundations Theory (MFT), a framework that categorizes human moral reasoning into five dimensions: Harm, Fairness, Ingroup Loyalty, Authority, and Purity. Previous research has used the MFT to measure differences in human participants along political, national, and cultural lines. While there has been some analysis of the responses of LLM with respect to political stance in role-playing scenarios, no work so far has directly assessed the moral leanings in the LLM responses, nor have they connected LLM outputs with robust human data. In this paper we analyze the distinctions between LLM MFT responses and existing human research directly, investigating whether commonly available LLM responses demonstrate ideological leanings: either through their inherent responses, straightforward representations of political ideologies, or when responding from the perspectives of constructed human personas. We assess whether LLMs inherently generate responses that align more closely with one political ideology over another, and additionally examine how accurately LLMs can represent ideological perspectives through both explicit prompting and demographic-based role-playing. By systematically analyzing LLM behavior across these conditions and experiments, our study provides insight into the extent of political and demographic dependency in AI-generated responses.

AISep 26, 2025
Axiomatic Choice and the Decision-Evaluation Paradox

Ben Abramowitz, Nicholas Mattei

We introduce a framework for modeling decisions with axioms that are statements about decisions, e.g., ethical constraints. Using our framework we define a taxonomy of decision axioms based on their structural properties and demonstrate a tension between the use of axioms to make decisions and the use of axioms to evaluate decisions which we call the Decision-Evaluation Paradox. We argue that the Decision-Evaluation Paradox arises with realistic axiom structures, and the paradox illuminates why one must be exceptionally careful when training models on decision data or applying axioms to make and evaluate decisions.

IRSep 10, 2025
Envy-Free but Still Unfair: Envy-Freeness Up To One Item (EF-1) in Personalized Recommendation

Amanda Aird, Ben Armstrong, Nicholas Mattei et al.

Envy-freeness and the relaxation to Envy-freeness up to one item (EF-1) have been used as fairness concepts in the economics, game theory, and social choice literatures since the 1960s, and have recently gained popularity within the recommendation systems communities. In this short position paper we will give an overview of envy-freeness and its use in economics and recommendation systems; and illustrate why envy is not appropriate to measure fairness for use in settings where personalization plays a role.

AIJul 2, 2025
The Illusion of Fairness: Auditing Fairness Interventions with Audit Studies

Disa Sariola, Patrick Button, Aron Culotta et al.

Artificial intelligence systems, especially those using machine learning, are being deployed in domains from hiring to loan issuance in order to automate these complex decisions. Judging both the effectiveness and fairness of these AI systems, and their human decision making counterpart, is a complex and important topic studied across both computational and social sciences. Within machine learning, a common way to address bias in downstream classifiers is to resample the training data to offset disparities. For example, if hiring rates vary by some protected class, then one may equalize the rate within the training set to alleviate bias in the resulting classifier. While simple and seemingly effective, these methods have typically only been evaluated using data obtained through convenience samples, introducing selection bias and label bias into metrics. Within the social sciences, psychology, public health, and medicine, audit studies, in which fictitious ``testers'' (e.g., resumes, emails, patient actors) are sent to subjects (e.g., job openings, businesses, doctors) in randomized control trials, provide high quality data that support rigorous estimates of discrimination. In this paper, we investigate how data from audit studies can be used to improve our ability to both train and evaluate automated hiring algorithms. We find that such data reveals cases where the common fairness intervention method of equalizing base rates across classes appears to achieve parity using traditional measures, but in fact has roughly 10% disparity when measured appropriately. We additionally introduce interventions based on individual treatment effect estimation methods that further reduce algorithmic discrimination using this data.

LGJun 17, 2025
Fair Algorithms with Probing for Multi-Agent Multi-Armed Bandits

Tianyi Xu, Jiaxin Liu, Nicholas Mattei et al.

We propose a multi-agent multi-armed bandit (MA-MAB) framework aimed at ensuring fair outcomes across agents while maximizing overall system performance. A key challenge in this setting is decision-making under limited information about arm rewards. To address this, we introduce a novel probing framework that strategically gathers information about selected arms before allocation. In the offline setting, where reward distributions are known, we leverage submodular properties to design a greedy probing algorithm with a provable performance bound. For the more complex online setting, we develop an algorithm that achieves sublinear regret while maintaining fairness. Extensive experiments on synthetic and real-world datasets show that our approach outperforms baseline methods, achieving better fairness and efficiency.

LGFeb 21, 2022
Learning Behavioral Soft Constraints from Demonstrations

Arie Glazier, Andrea Loreggia, Nicholas Mattei et al.

Many real-life scenarios require humans to make difficult trade-offs: do we always follow all the traffic rules or do we violate the speed limit in an emergency? These scenarios force us to evaluate the trade-off between collective rules and norms with our own personal objectives and desires. To create effective AI-human teams, we must equip AI agents with a model of how humans make these trade-offs in complex environments when there are implicit and explicit rules and constraints. Agent equipped with these models will be able to mirror human behavior and/or to draw human attention to situations where decision making could be improved. To this end, we propose a novel inverse reinforcement learning (IRL) method: Max Entropy Inverse Soft Constraint IRL (MESC-IRL), for learning implicit hard and soft constraints over states, actions, and state features from demonstrations in deterministic and non-deterministic environments modeled as Markov Decision Processes (MDPs). Our method enables agents implicitly learn human constraints and desires without the need for explicit modeling by the agent designer and to transfer these constraints between environments. Our novel method generalizes prior work which only considered deterministic hard constraints and achieves state of the art performance.

AIJan 19, 2022
When Is It Acceptable to Break the Rules? Knowledge Representation of Moral Judgement Based on Empirical Data

Edmond Awad, Sydney Levine, Andrea Loreggia et al.

One of the most remarkable things about the human moral mind is its flexibility. We can make moral judgments about cases we have never seen before. We can decide that pre-established rules should be broken. We can invent novel rules on the fly. Capturing this flexibility is one of the central challenges in developing AI systems that can interpret and produce human-like moral judgment. This paper details the results of a study of real-world decision makers who judge whether it is acceptable to break a well-established norm: ``no cutting in line.'' We gather data on how human participants judge the acceptability of line-cutting in a range of scenarios. Then, in order to effectively embed these reasoning capabilities into a machine, we propose a method for modeling them using a preference-based structure, which captures a novel modification to standard ``dual process'' theories of moral judgment.

AIJan 18, 2022
Combining Fast and Slow Thinking for Human-like and Efficient Navigation in Constrained Environments

Marianna B. Ganapini, Murray Campbell, Francesco Fabiano et al.

Current AI systems lack several important human capabilities, such as adaptability, generalizability, self-control, consistency, common sense, and causal reasoning. We believe that existing cognitive theories of human decision making, such as the thinking fast and slow theory, can provide insights on how to advance AI systems towards some of these capabilities. In this paper, we propose a general architecture that is based on fast/slow solvers and a metacognitive component. We then present experimental results on the behavior of an instance of this architecture, for AI systems that make decisions about navigating in a constrained environment. We show how combining the fast and slow decision modalities allows the system to evolve over time and gradually pass from slow to fast thinking with enough experience, and that this greatly helps in decision quality, resource consumption, and efficiency.

AIOct 5, 2021
Thinking Fast and Slow in AI: the Role of Metacognition

Marianna Bergamaschi Ganapini, Murray Campbell, Francesco Fabiano et al.

AI systems have seen dramatic advancement in recent years, bringing many applications that pervade our everyday life. However, we are still mostly seeing instances of narrow AI: many of these recent developments are typically focused on a very limited set of competencies and goals, e.g., image interpretation, natural language processing, classification, prediction, and many others. Moreover, while these successes can be accredited to improved algorithms and techniques, they are also tightly linked to the availability of huge datasets and computational power. State-of-the-art AI still lacks many capabilities that would naturally be included in a notion of (human) intelligence. We argue that a better study of the mechanisms that allow humans to have these capabilities can help us understand how to imbue AI systems with these competencies. We focus especially on D. Kahneman's theory of thinking fast and slow, and we propose a multi-agent AI architecture where incoming problems are solved by either system 1 (or "fast") agents, that react by exploiting only past experience, or by system 2 (or "slow") agents, that are deliberately activated when there is the need to reason and search for optimal solutions beyond what is expected from the system 1 agent. Both kinds of agents are supported by a model of the world, containing domain knowledge about the environment, and a model of "self", containing information about past actions of the system and solvers' skills.

AISep 22, 2021
Making Human-Like Trade-offs in Constrained Environments by Learning from Demonstrations

Arie Glazier, Andrea Loreggia, Nicholas Mattei et al.

Many real-life scenarios require humans to make difficult trade-offs: do we always follow all the traffic rules or do we violate the speed limit in an emergency? These scenarios force us to evaluate the trade-off between collective norms and our own personal objectives. To create effective AI-human teams, we must equip AI agents with a model of how humans make trade-offs in complex, constrained environments. These agents will be able to mirror human behavior or to draw human attention to situations where decision making could be improved. To this end, we propose a novel inverse reinforcement learning (IRL) method for learning implicit hard and soft constraints from demonstrations, enabling agents to quickly adapt to new settings. In addition, learning soft constraints over states, actions, and state features allows agents to transfer this knowledge to new domains that share similar aspects. We then use the constraint learning method to implement a novel system architecture that leverages a cognitive model of human decision making, multi-alternative decision field theory (MDFT), to orchestrate competing objectives. We evaluate the resulting agent on trajectory length, number of violated constraints, and total reward, demonstrating that our agent architecture is both general and achieves strong performance. Thus we are able to capture and replicate human-like trade-offs from demonstrations in environments when constraints are not explicit.

GTJul 21, 2021
Peer Selection with Noisy Assessments

Omer Lev, Nicholas Mattei, Paolo Turrini et al.

In the peer selection problem a group of agents must select a subset of themselves as winners for, e.g., peer-reviewed grants or prizes. Here, we take a Condorcet view of this aggregation problem, i.e., that there is a ground-truth ordering over the agents and we wish to select the best set of agents, subject to the noisy assessments of the peers. Given this model, some agents may be unreliable, while others might be self-interested, attempting to influence the outcome in their favour. In this paper we extend PeerNomination, the most accurate peer reviewing algorithm to date, into WeightedPeerNomination, which is able to handle noisy and inaccurate agents. To do this, we explicitly formulate assessors' reliability weights in a way that does not violate strategyproofness, and use this information to reweight their scores. We show analytically that a weighting scheme can improve the overall accuracy of the selection significantly. Finally, we implement several instances of reweighting methods and show empirically that our methods are robust in the face of noisy assessments.

GTDec 4, 2020
Modeling Voters in Multi-Winner Approval Voting

Jaelle Scheuerman, Jason Harman, Nicholas Mattei et al.

In many real world situations, collective decisions are made using voting and, in scenarios such as committee or board elections, employing voting rules that return multiple winners. In multi-winner approval voting (AV), an agent submits a ballot consisting of approvals for as many candidates as they wish, and winners are chosen by tallying up the votes and choosing the top-$k$ candidates receiving the most approvals. In many scenarios, an agent may manipulate the ballot they submit in order to achieve a better outcome by voting in a way that does not reflect their true preferences. In complex and uncertain situations, agents may use heuristics instead of incurring the additional effort required to compute the manipulation which most favors them. In this paper, we examine voting behavior in single-winner and multi-winner approval voting scenarios with varying degrees of uncertainty using behavioral data obtained from Mechanical Turk. We find that people generally manipulate their vote to obtain a better outcome, but often do not identify the optimal manipulation. There are a number of predictive models of agent behavior in the COMSOC and psychology literature that are based on cognitively plausible heuristic strategies. We show that the existing approaches do not adequately model real-world data. We propose a novel model that takes into account the size of the winning set and human cognitive constraints, and demonstrate that this model is more effective at capturing real-world behaviors in multi-winner approval voting scenarios.

AIOct 12, 2020
Thinking Fast and Slow in AI

Grady Booch, Francesco Fabiano, Lior Horesh et al.

This paper proposes a research direction to advance AI which draws inspiration from cognitive theories of human decision making. The premise is that if we gain insights about the causes of some human capabilities that are still lacking in AI (for instance, adaptability, generalizability, common sense, and causal reasoning), we may obtain similar capabilities in an AI system by embedding these causal components. We hope that the high-level description of our vision included in this paper, as well as the several research questions that we propose to consider, can stimulate the AI research community to define, try and evaluate new methodologies, frameworks, and evaluation metrics, in the spirit of achieving a better understanding of both human and machine intelligence.

IRSep 5, 2020
"And the Winner Is...": Dynamic Lotteries for Multi-group Fairness-Aware Recommendation

Nasim Sonboli, Robin Burke, Nicholas Mattei et al.

As recommender systems are being designed and deployed for an increasing number of socially-consequential applications, it has become important to consider what properties of fairness these systems exhibit. There has been considerable research on recommendation fairness. However, we argue that the previous literature has been based on simple, uniform and often uni-dimensional notions of fairness assumptions that do not recognize the real-world complexities of fairness-aware applications. In this paper, we explicitly represent the design decisions that enter into the trade-off between accuracy and fairness across multiply-defined and intersecting protected groups, supporting multiple fairness metrics. The framework also allows the recommender to adjust its performance based on the historical view of recommendations that have been delivered over a time horizon, dynamically rebalancing between fairness concerns. Within this framework, we formulate lottery-based mechanisms for choosing between fairness concerns, and demonstrate their performance in two recommendation domains.

GTApr 30, 2020
PeerNomination: Relaxing Exactness for Increased Accuracy in Peer Selection

Nicholas Mattei, Paolo Turrini, Stanislav Zhydkov

In peer selection agents must choose a subset of themselves for an award or a prize. As agents are self-interested, we want to design algorithms that are impartial, so that an individual agent cannot affect their own chance of being selected. This problem has broad application in resource allocation and mechanism design and has received substantial attention in the artificial intelligence literature. Here, we present a novel algorithm for impartial peer selection, PeerNomination, and provide a theoretical analysis of its accuracy. Our algorithm possesses various desirable features. In particular, it does not require an explicit partitioning of the agents, as previous algorithms in the literature. We show empirically that it achieves higher accuracy than the exiting algorithms over several metrics.

LGFeb 21, 2020
A Multi-Channel Neural Graphical Event Model with Negative Evidence

Tian Gao, Dharmashankar Subramanian, Karthikeyan Shanmugam et al.

Event datasets are sequences of events of various types occurring irregularly over the time-line, and they are increasingly prevalent in numerous domains. Existing work for modeling events using conditional intensities rely on either using some underlying parametric form to capture historical dependencies, or on non-parametric models that focus primarily on tasks such as prediction. We propose a non-parametric deep neural network approach in order to estimate the underlying intensity functions. We use a novel multi-channel RNN that optimally reinforces the negative evidence of no observable events with the introduction of fake event epochs within each consecutive inter-event interval. We evaluate our method against state-of-the-art baselines on model fitting tasks as gauged by log-likelihood. Through experiments on both synthetic and real-world datasets, we find that our proposed approach outperforms existing baselines on most of the datasets studied.

LGDec 9, 2019
Group Fairness in Bandit Arm Selection

Candice Schumann, Zhi Lang, Nicholas Mattei et al.

We propose a novel formulation of group fairness with biased feedback in the contextual multi-armed bandit (CMAB) setting. In the CMAB setting, a sequential decision maker must, at each time step, choose an arm to pull from a finite set of arms after observing some context for each of the potential arm pulls. In our model, arms are partitioned into two or more sensitive groups based on some protected feature(s) (e.g., age, race, or socio-economic status). Initial rewards received from pulling an arm may be distorted due to some unknown societal or measurement bias. We assume that in reality these groups are equal despite the biased feedback received by the agent. To alleviate this, we learn a societal bias term which can be used to both find the source of bias and to potentially fix the problem outside of the algorithm. We provide a novel algorithm that can accommodate this notion of fairness for an arbitrary number of groups, and provide a theoretical bound on the regret for our algorithm. We validate our algorithm using synthetic data and two real-world datasets for intervention settings wherein we want to allocate resources fairly across groups.

GTNov 29, 2019
Heuristic Strategies in Uncertain Approval Voting Environments

Jaelle Scheuerman, Jason L. Harman, Nicholas Mattei et al.

In many collective decision making situations, agents vote to choose an alternative that best represents the preferences of the group. Agents may manipulate the vote to achieve a better outcome by voting in a way that does not reflect their true preferences. In real world voting scenarios, people often do not have complete information about other voter preferences and it can be computationally complex to identify a strategy that will maximize their expected utility. In such situations, it is often assumed that voters will vote truthfully rather than expending the effort to strategize. However, being truthful is just one possible heuristic that may be used. In this paper, we examine the effectiveness of heuristics in single winner and multi-winner approval voting scenarios with missing votes. In particular, we look at heuristics where a voter ignores information about other voting profiles and makes their decisions based solely on how much they like each candidate. In a behavioral experiment, we show that people vote truthfully in some situations and prioritize high utility candidates in others. We examine when these behaviors maximize expected utility and show how the structure of the voting environment affects both how well each heuristic performs and how humans employ these heuristics.

CLNov 5, 2019
Infusing Knowledge into the Textual Entailment Task Using Graph Convolutional Networks

Pavan Kapanipathi, Veronika Thost, Siva Sankalp Patel et al.

Textual entailment is a fundamental task in natural language processing. Most approaches for solving the problem use only the textual content present in training data. A few approaches have shown that information from external knowledge sources like knowledge graphs (KGs) can add value, in addition to the textual content, by providing background knowledge that may be critical for a task. However, the proposed models do not fully exploit the information in the usually large and noisy KGs, and it is not clear how it can be effectively encoded to be useful for entailment. We present an approach that complements text-based entailment models with information from KGs by (1) using Personalized PageR- ank to generate contextual subgraphs with reduced noise and (2) encoding these subgraphs using graph convolutional networks to capture KG structure. Our technique extends the capability of text models exploiting structural and semantic information found in KGs. We evaluate our approach on multiple textual entailment datasets and show that the use of external knowledge helps improve prediction accuracy. This is particularly evident in the challenging BreakingNLI dataset, where we see an absolute improvement of 5-20% over multiple text-based entailment models.

GTMay 28, 2019
Heuristics in Multi-Winner Approval Voting

Jaelle Scheuerman, Jason L. Harman, Nicholas Mattei et al.

In many real world situations, collective decisions are made using voting. Moreover, scenarios such as committee or board elections require voting rules that return multiple winners. In multi-winner approval voting (AV), an agent may vote for as many candidates as they wish. Winners are chosen by tallying up the votes and choosing the top-$k$ candidates receiving the most votes. An agent may manipulate the vote to achieve a better outcome by voting in a way that does not reflect their true preferences. In complex and uncertain situations, agents may use heuristics to strategize, instead of incurring the additional effort required to compute the manipulation which most favors them. In this paper, we examine voting behavior in multi-winner approval voting scenarios with complete information. We show that people generally manipulate their vote to obtain a better outcome, but often do not identify the optimal manipulation. Instead, voters tend to prioritize the candidates with the highest utilities. Using simulations, we demonstrate the effectiveness of these heuristics in situations where agents only have access to partial information.

AIDec 10, 2018
Building Ethically Bounded AI

Francesca Rossi, Nicholas Mattei

The more AI agents are deployed in scenarios with possibly unexpected situations, the more they need to be flexible, adaptive, and creative in achieving the goal we have given them. Thus, a certain level of freedom to choose the best path to the goal is inherent in making AI robust and flexible enough. At the same time, however, the pervasive deployment of AI in our life, whether AI is autonomous or collaborating with humans, raises several ethical challenges. AI agents should be aware and follow appropriate ethical principles and should thus exhibit properties such as fairness or other virtues. These ethical principles should define the boundaries of AI's freedom and creativity. However, it is still a challenge to understand how to specify and reason with ethical boundaries in AI agents and how to combine them appropriately with subjective preferences and goal specifications. Some initial attempts employ either a data-driven example-based approach for both, or a symbolic rule-based approach for both. We envision a modular approach where any AI technique can be used for any of these essential ingredients in decision making or decision support systems, paired with a contextual approach to define their combination and relative weight. In a world where neither humans nor AI systems work in isolation, but are tightly interconnected, e.g., the Internet of Things, we also envision a compositional approach to building ethically bounded AI, where the ethical properties of each component can be fruitfully exploited to derive those of the overall system. In this paper we define and motivate the notion of ethically-bounded AI, we describe two concrete examples, and we outline some outstanding challenges.

LGSep 21, 2018
CPMetric: Deep Siamese Networks for Learning Distances Between Structured Preferences

Andrea Loreggia, Nicholas Mattei, Francesca Rossi et al.

Preference are central to decision making by both machines and humans. Representing, learning, and reasoning with preferences is an important area of study both within computer science and across the sciences. When working with preferences it is necessary to understand and compute the distance between sets of objects, e.g., the preferences of a user and a the descriptions of objects to be recommended. We present CPDist, a novel neural network to address the problem of learning to measure the distance between structured preference representations. We use the popular CP-net formalism to represent preferences and then leverage deep neural networks to learn a recently proposed metric function that is computationally hard to compute directly. CPDist is a novel metric learning approach based on the use of deep siamese networks which learn the Kendal Tau distance between partial orders that are induced by compact preference representations. We find that CPDist is able to learn the distance function with high accuracy and outperform existing approximation algorithms on both the regression and classification task using less computation time. Performance remains good even when CPDist is trained with only a small number of samples compared to the dimension of the solution space, indicating the network generalizes well.

LGSep 21, 2018
Interpretable Multi-Objective Reinforcement Learning through Policy Orchestration

Ritesh Noothigattu, Djallel Bouneffouf, Nicholas Mattei et al.

Autonomous cyber-physical agents and systems play an increasingly large role in our lives. To ensure that agents behave in ways aligned with the values of the societies in which they operate, we must develop techniques that allow these agents to not only maximize their reward in an environment, but also to learn and follow the implicit constraints of society. These constraints and norms can come from any number of sources including regulations, business process guidelines, laws, ethical principles, social norms, and moral values. We detail a novel approach that uses inverse reinforcement learning to learn a set of unspecified constraints from demonstrations of the task, and reinforcement learning to learn to maximize the environment rewards. More precisely, we assume that an agent can observe traces of behavior of members of the society but has no access to the explicit set of constraints that give rise to the observed behavior. Inverse reinforcement learning is used to learn such constraints, that are then combined with a possibly orthogonal value function through the use of a contextual bandit-based orchestrator that picks a contextually-appropriate choice between the two policies (constraint-based and environment reward-based) when taking actions. The contextual bandit orchestrator allows the agent to mix policies in novel ways, taking the best actions from either a reward maximizing or constrained policy. In addition, the orchestrator is transparent on which policy is being employed at each time step. We test our algorithms using a Pac-Man domain and show that the agent is able to learn to act optimally, act within the demonstrated constraints, and mix these two functions in complex ways.

AISep 15, 2018
Answering Science Exam Questions Using Query Rewriting with Background Knowledge

Ryan Musa, Xiaoyan Wang, Achille Fokoue et al.

Open-domain question answering (QA) is an important problem in AI and NLP that is emerging as a bellwether for progress on the generalizability of AI methods and techniques. Much of the progress in open-domain QA systems has been realized through advances in information retrieval methods and corpus construction. In this paper, we focus on the recently introduced ARC Challenge dataset, which contains 2,590 multiple choice questions authored for grade-school science exams. These questions are selected to be the most challenging for current QA systems, and current state of the art performance is only slightly better than random chance. We present a system that rewrites a given question into queries that are used to retrieve supporting text from a large corpus of science-related text. Our rewriter is able to incorporate background knowledge from ConceptNet and -- in tandem with a generic textual entailment system trained on SciTail that identifies support in the retrieved results -- outperforms several strong baselines on the end-to-end QA task despite only being trained to identify essential terms in the original source question. We use a generalizable decision methodology over the retrieved evidence and answer candidates to select the best answer. By combining query rewriting, background knowledge, and textual entailment our system is able to outperform several strong baselines on the ARC dataset.

AISep 15, 2018
Improving Natural Language Inference Using External Knowledge in the Science Questions Domain

Xiaoyan Wang, Pavan Kapanipathi, Ryan Musa et al.

Natural Language Inference (NLI) is fundamental to many Natural Language Processing (NLP) applications including semantic search and question answering. The NLI problem has gained significant attention thanks to the release of large scale, challenging datasets. Present approaches to the problem largely focus on learning-based methods that use only textual information in order to classify whether a given premise entails, contradicts, or is neutral with respect to a given hypothesis. Surprisingly, the use of methods based on structured knowledge -- a central topic in artificial intelligence -- has not received much attention vis-a-vis the NLI problem. While there are many open knowledge bases that contain various types of reasoning information, their use for NLI has not been well explored. To address this, we present a combination of techniques that harness knowledge graphs to improve performance on the NLI problem in the science questions domain. We present the results of applying our techniques on text, graph, and text-to-graph based models, and discuss implications for the use of external knowledge in solving the NLI problem. Our model achieves the new state-of-the-art performance on the NLI problem over the SciTail science questions dataset.

AISep 15, 2018
Incorporating Behavioral Constraints in Online AI Systems

Avinash Balakrishnan, Djallel Bouneffouf, Nicholas Mattei et al.

AI systems that learn through reward feedback about the actions they take are increasingly deployed in domains that have significant impact on our daily life. However, in many cases the online rewards should not be the only guiding criteria, as there are additional constraints and/or priorities imposed by regulations, values, preferences, or ethical principles. We detail a novel online agent that learns a set of behavioral constraints by observation and uses these learned constraints as a guide when making decisions in an online setting while still being reactive to reward feedback. To define this agent, we propose to adopt a novel extension to the classical contextual multi-armed bandit setting and we provide a new algorithm called Behavior Constrained Thompson Sampling (BCTS) that allows for online learning while obeying exogenous constraints. Our agent learns a constrained policy that implements the observed behavioral constraints demonstrated by a teacher agent, and then uses this constrained policy to guide the reward-based online exploration and exploitation. We characterize the upper bound on the expected regret of the contextual bandit algorithm that underlies our agent and provide a case study with real world data in two application domains. Our experiments show that the designed agent is able to act within the set of behavior constraints without significantly degrading its overall reward performance.

AIJun 1, 2018
A Systematic Classification of Knowledge, Reasoning, and Context within the ARC Dataset

Michael Boratko, Harshit Padigela, Divyendra Mikkilineni et al.

The recent work of Clark et al. introduces the AI2 Reasoning Challenge (ARC) and the associated ARC dataset that partitions open domain, complex science questions into an Easy Set and a Challenge Set. That paper includes an analysis of 100 questions with respect to the types of knowledge and reasoning required to answer them; however, it does not include clear definitions of these types, nor does it offer information about the quality of the labels. We propose a comprehensive set of definitions of knowledge and reasoning types necessary for answering the questions in the ARC dataset. Using ten annotators and a sophisticated annotation interface, we analyze the distribution of labels across the Challenge Set and statistics related to them. Additionally, we demonstrate that although naive information retrieval methods return sentences that are irrelevant to answering the query, sufficient supporting text is often present in the (ARC) corpus. Evaluating with human-selected relevant sentences improves the performance of a neural machine comprehension model by 42 points.

LGMay 14, 2018
A Cost-Effective Framework for Preference Elicitation and Aggregation

Zhibing Zhao, Haoming Li, Junming Wang et al.

We propose a cost-effective framework for preference elicitation and aggregation under the Plackett-Luce model with features. Given a budget, our framework iteratively computes the most cost-effective elicitation questions in order to help the agents make a better group decision. We illustrate the viability of the framework with experiments on Amazon Mechanical Turk, which we use to estimate the cost of answering different types of elicitation questions. We compare the prediction accuracy of our framework when adopting various information criteria that evaluate the expected information gain from a question. Our experiments show carefully designed information criteria are much more efficient, i.e., they arrive at the correct answer using fewer queries, than randomly asking questions given the budget constraint.

AIMay 19, 2017
The Conference Paper Assignment Problem: Using Order Weighted Averages to Assign Indivisible Goods

Jing Wu Lian, Nicholas Mattei, Renee Noble et al.

Motivated by the common academic problem of allocating papers to referees for conference reviewing we propose a novel mechanism for solving the assignment problem when we have a two sided matching problem with preferences from one side (the agents/reviewers) over the other side (the objects/papers) and both sides have capacity constraints. The assignment problem is a fundamental problem in both computer science and economics with application in many areas including task and resource allocation. We draw inspiration from multi-criteria decision making and voting and use order weighted averages (OWAs) to propose a novel and flexible class of algorithms for the assignment problem. We show an algorithm for finding a $Σ$-OWA assignment in polynomial time, in contrast to the NP-hardness of finding an egalitarian assignment. Inspired by this setting we observe an interesting connection between our model and the classic proportional multi-winner election problem in social choice.

AIJan 26, 2017
Ethical Considerations in Artificial Intelligence Courses

Emanuelle Burton, Judy Goldsmith, Sven Koenig et al.

The recent surge in interest in ethics in artificial intelligence may leave many educators wondering how to address moral, ethical, and philosophical issues in their AI courses. As instructors we want to develop curriculum that not only prepares students to be artificial intelligence practitioners, but also to understand the moral, ethical, and philosophical impacts that artificial intelligence will have on society. In this article we provide practical case studies and links to resources for use by AI educators. We also provide concrete suggestions on how to integrate AI ethics into a general artificial intelligence course and how to teach a stand-alone artificial intelligence ethics course.

GTAug 3, 2016
Empirical Evaluation of Real World Tournaments

Nicholas Mattei, Toby Walsh

Computational Social Choice (ComSoc) is a rapidly developing field at the intersection of computer science, economics, social choice, and political science. The study of tournaments is fundamental to ComSoc and many results have been published about tournament solution sets and reasoning in tournaments. Theoretical results in ComSoc tend to be worst case and tell us little about performance in practice. To this end we detail some experiments on tournaments using real wold data from soccer and tennis. We make three main contributions to the understanding of tournaments using real world data from English Premier League, the German Bundesliga, and the ATP World Tour: (1) we find that the NP-hard question of finding a seeding for which a given team can win a tournament is easily solvable in real world instances, (2) using detailed and principled methodology from statistical physics we show that our real world data obeys a log-normal distribution; and (3) leveraging our log-normal distribution result and using robust statistical methods, we show that the popular Condorcet Random (CR) tournament model does not generate realistic tournament data.

GTMay 31, 2016
Interdependent Scheduling Games

Andres Abeliuk, Haris Aziz, Gerardo Berbeglia et al.

We propose a model of interdependent scheduling games in which each player controls a set of services that they schedule independently. A player is free to schedule his own services at any time; however, each of these services only begins to accrue reward for the player when all predecessor services, which may or may not be controlled by the same player, have been activated. This model, where players have interdependent services, is motivated by the problems faced in planning and coordinating large-scale infrastructures, e.g., restoring electricity and gas to residents after a natural disaster or providing medical care in a crisis when different agencies are responsible for the delivery of staff, equipment, and medicine. We undertake a game-theoretic analysis of this setting and in particular consider the issues of welfare maximization, computing best responses, Nash dynamics, and existence and computation of Nash equilibria.

GTApr 13, 2016
Strategyproof Peer Selection using Randomization, Partitioning, and Apportionment

Haris Aziz, Omer Lev, Nicholas Mattei et al.

Peer reviews, evaluations, and selections are a fundamental aspect of modern science. Funding bodies the world over employ experts to review and select the best proposals from those submitted for funding. The problem of peer selection, however, is much more general: a professional society may want to give a subset of its members awards based on the opinions of all members; an instructor for a Massive Open Online Course (MOOC) or an online course may want to crowdsource grading; or a marketing company may select ideas from group brainstorming sessions based on peer evaluation. We make three fundamental contributions to the study of peer selection, a specific type of group decision-making problem, studied in computer science, economics, and political science. First, we propose a novel mechanism that is strategyproof, i.e., agents cannot benefit by reporting insincere valuations. Second, we demonstrate the effectiveness of our mechanism by a comprehensive simulation-based comparison with a suite of mechanisms found in the literature. Finally, our mechanism employs a randomized rounding technique that is of independent interest, as it solves the apportionment problem that arises in various settings where discrete resources such as parliamentary representation slots need to be divided proportionally.

GTAug 21, 2014
A Study of Proxies for Shapley Allocations of Transport Costs

Haris Aziz, Casey Cahan, Charles Gretton et al.

We propose and evaluate a number of solutions to the problem of calculating the cost to serve each location in a single-vehicle transport setting. Such cost to serve analysis has application both strategically and operationally in transportation. The problem is formally given by the traveling salesperson game (TSG), a cooperative total utility game in which agents correspond to locations in a traveling salesperson problem (TSP). The cost to serve a location is an allocated portion of the cost of an optimal tour. The Shapley value is one of the most important normative division schemes in cooperative games, giving a principled and fair allocation both for the TSG and more generally. We consider a number of direct and sampling-based procedures for calculating the Shapley value, and present the first proof that approximating the Shapley value of the TSG within a constant factor is NP-hard. Treating the Shapley value as an ideal baseline allocation, we then develop six proxies for that value which are relatively easy to compute. We perform an experimental evaluation using Synthetic Euclidean games as well as games derived from real-world tours calculated for fast-moving consumer goods scenarios. Our experiments show that several computationally tractable allocation techniques correspond to good proxies for the Shapley value.

GTJul 11, 2014
Computational Aspects of Multi-Winner Approval Voting

Haris Aziz, Serge Gaspers, Joachim Gudmundsson et al.

We study computational aspects of three prominent voting rules that use approval ballots to elect multiple winners. These rules are satisfaction approval voting, proportional approval voting, and reweighted approval voting. We first show that computing the winner for proportional approval voting is NP-hard, closing a long standing open problem. As none of the rules are strategyproof, even for dichotomous preferences, we study various strategic aspects of the rules. In particular, we examine the computational complexity of computing a best response for both a single agent and a group of agents. In many settings, we show that it is NP-hard for an agent or agents to compute how best to vote given a fixed set of approval ballots from the other agents.

AIApr 23, 2013
How Hard Is It to Control an Election by Breaking Ties?

Nicholas Mattei, Nina Narodytska, Toby Walsh

We study the computational complexity of controlling the result of an election by breaking ties strategically. This problem is equivalent to the problem of deciding the winner of an election under parallel universes tie-breaking. When the chair of the election is only asked to break ties to choose between one of the co-winners, the problem is trivially easy. However, in multi-round elections, we prove that it can be NP-hard for the chair to compute how to break ties to ensure a given result. Additionally, we show that the form of the tie-breaking function can increase the opportunities for control. Indeed, we prove that it can be NP-hard to control an election by breaking ties even with a two-stage voting rule.