46.6 GT May 30
Combatting Gerrymandering with Ranked Choice Voting: An Experimental Analysis of Multi-member Districts in the United States
Nikhil Garg, Wes Gurnee, David Rothschild et al.
Every representative democracy must specify a mechanism under which voters choose their representatives. The most common mechanism in the United States, winner-takes-all single-member districts, both enables substantial partisan gerrymandering and constrains 'fair' redistricting, preventing proportional representation in legislatures. We study the design of multi-member districts (MMDs), in which each district elects multiple representatives, potentially through a non-winner-takes-all voting rule. We carry out large-scale empirical analyses for the U.S. House of Representatives under MMDs with different social choice functions, under algorithmically generated maps optimized for either partisan benefit or proportionality. Doing so requires efficiently incorporating predicted partisan outcomes (under various multi-winner social choice functions) into an algorithm that optimizes over an ensemble of maps. We find that with three-member districts using Single Transferable Vote, fairness-minded independent commissions would be able to achieve proportional outcomes in every state up to rounding, and advantage-seeking partisans would have their power to gerrymander significantly curtailed. Simultaneously, such districts would preserve geographic cohesion. Through simulation, we find that the insights are robust to cross-party voting. In the process, we advance a rich research agenda at the intersection of social choice and computational gerrymandering.
CL Oct 24, 2023
Prevalence and prevention of large language model use in crowd work
Veniamin Veselovsky, Manoel Horta Ribeiro, Philip Cozzolino et al.
We show that the use of large language models (LLMs) is prevalent among crowd workers, and that targeted mitigation strategies can significantly reduce, but not eliminate, LLM use. On a text summarization task where workers were not directed in any way regarding their LLM use, the estimated prevalence of LLM use was around 30%, but was reduced by about half by asking workers to not use LLMs and by raising the cost of using them, e.g., by disabling copy-pasting. Secondary analyses give further insight into LLM use and its prevention: LLM use yields high-quality but homogeneous responses, which may harm research concerned with human (rather than model) behavior and degrade future models trained with crowdsourced data. At the same time, preventing LLM use may be at odds with obtaining high-quality responses; e.g., when requesting workers not to use LLMs, summaries contained fewer keywords carrying essential information. Our estimates will likely change as LLMs increase in popularity or capabilities, and as norms around their usage change. Yet, understanding the co-evolution of LLM-based tools and users is key to maintaining the validity of research done using crowdsourcing, and we provide a critical baseline before widespread adoption ensues.
CL Feb 22, 2024
Framing in the Presence of Supporting Data: A Case Study in U.S. Economic News
Alexandria Leto, Elliot Pickens, Coen D. Needell et al.
The mainstream media has much leeway in what it chooses to cover and how it covers it. These choices have real-world consequences for what people know and their subsequent behaviors. However, the lack of objective measures to evaluate editorial choices makes research in this area particularly difficult. In this paper, we argue that there are newsworthy topics where objective measures exist in the form of supporting data and propose a computational framework to analyze editorial choices in this setup. We focus on the economy because the reporting of economic indicators presents us with a relatively easy way to determine both the selection and framing of various publications. Their values provide a ground truth of how the economy is doing relative to how the publications choose to cover it. To do this, we define frame prediction as a set of interdependent tasks. At the article level, we learn to identify the reported stance towards the general state of the economy. Then, for every numerical quantity reported in the article, we learn to identify whether it corresponds to an economic indicator and whether it is being reported in a positive or negative way. To perform our analysis, we track six American publishers and each article that appeared in the top 10 slots of their landing page between 2015 and 2023.