Mohak Goyal

3 papers

1 citation

Novelty: 52%

AI Score: 36

Ranked #118,075 of 201,326 authors (top 59%); #7,857 in AI (top 55%)

3 Papers

13.5 · SI · Apr 13

Quality-Sensitive Matrix Factorization for Community Notes: Towards Sample Efficiency and Manipulation Resistance

Mohak Goyal, Nishka Arora, Ashish Goel

Community Notes is X's crowdsourced fact-checking program: contributors write short notes that add context to potentially misleading posts, and other contributors rate whether those notes are helpful. Its algorithm uses a matrix factorization model to separate ideology from note quality, so notes are surfaced only when they receive support across ideological lines. After ideology is accounted for, however, the model gives all raters equal influence on quality estimates. This slows consensus formation and leaves the quality estimate vulnerable to noisy or strategic raters. We propose Quality-Sensitive Matrix Factorization (QSMF), which uses a per-rater quality-sensitivity parameter \(\hat{\rho}_i\) estimated jointly with all other parameters. This connects QSMF to peer prediction: without external ground truth, it gives more influence to raters whose ideology-adjusted ratings are more consistent with the note-quality estimates learned from all the ratings. We evaluate QSMF on 45M ratings over 365K notes from the six months before the 2024 U.S. presidential election. Split-half tests confirm that quality sensitivity is a stable, empirically recoverable rater trait. In evaluation on high-traffic notes, QSMF requires 26-40% fewer ratings to match the baseline's accuracy. In semi-synthetic coordinated attacks on notes of opposing ideology, QSMF substantially reduces the displacement of the estimated quality of targeted notes relative to the baseline. In synthetic data with known ground truth, \(\hat{\rho}_i\) separates good from bad raters with an AUC above 0.94, and QSMF achieves much lower error in recovering true note quality in the presence of bad raters. These gains come from a single additional scalar parameter per rater, with no external ground truth and no manual moderation.
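The role of the quality-sensitivity parameter can be illustrated with a minimal sketch. The abstract does not state the model equation, so the multiplicative form below (rho scaling the note-quality term), the one-dimensional ideology factor, and every name (Rater, Note, predict_baseline, predict_qsmf) are illustrative assumptions, not the authors' implementation.

```python
from dataclasses import dataclass


@dataclass
class Rater:
    intercept: float  # rater leniency
    ideology: float   # one-dimensional ideology factor (assumed)
    rho: float        # hypothetical quality-sensitivity parameter


@dataclass
class Note:
    quality: float    # note intercept, read as ideology-adjusted quality
    ideology: float   # one-dimensional ideology factor (assumed)


def predict_baseline(mu: float, rater: Rater, note: Note) -> float:
    """Baseline-style prediction: every rater responds equally to note quality."""
    return mu + rater.intercept + note.quality + rater.ideology * note.ideology


def predict_qsmf(mu: float, rater: Rater, note: Note) -> float:
    """Assumed QSMF-style prediction: rho scales the rater's response to quality."""
    return mu + rater.intercept + rater.rho * note.quality + rater.ideology * note.ideology


if __name__ == "__main__":
    careful = Rater(intercept=0.1, ideology=0.5, rho=1.0)
    noisy = Rater(intercept=0.1, ideology=0.5, rho=0.0)
    good_note = Note(quality=0.9, ideology=-0.2)
    weak_note = Note(quality=0.1, ideology=-0.2)
    for rater in (careful, noisy):
        gap = predict_qsmf(0.2, rater, good_note) - predict_qsmf(0.2, rater, weak_note)
        print(f"rho={rater.rho}: predicted rating gap between notes = {gap:.2f}")
```

Under this assumed form, a rater with rho equal to 1 behaves exactly as in the baseline, while a rater with rho near 0 produces ratings that carry no information about note quality and so contributes little to the quality estimate.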

AI · Aug 21, 2024

Estimating Contribution Quality in Online Deliberations Using a Large Language Model

Lodewijk Gelauff, Mohak Goyal, Bhargav Dindukurthi et al.

Deliberation involves participants exchanging knowledge, arguments, and perspectives and has been shown to be effective at addressing polarization. The Stanford Online Deliberation Platform facilitates large-scale deliberations. It enables video-based online discussions on a structured agenda for small groups without requiring human moderators. This paper's data comes from various deliberation events, including one conducted in collaboration with Meta in 32 countries, and another with 38 post-secondary institutions in the US. Estimating the quality of contributions in a conversation is crucial for assessing feature and intervention impacts. Traditionally, this is done by human annotators, which is time-consuming and costly. We use a large language model (LLM) alongside eight human annotators to rate contributions based on justification, novelty, expansion of the conversation, and potential for further expansion, with scores ranging from 1 to 5. Annotators also provide brief justifications for their ratings. Using the average rating from other human annotators as the ground truth, we find the model outperforms individual human annotators. While pairs of human annotators outperform the model in rating justification and groups of three outperform it on all four metrics, the model remains competitive. We illustrate the usefulness of the automated quality rating by assessing the effect of nudges on the quality of deliberation. We first observe that individual nudges after prolonged inactivity are highly effective, increasing the likelihood of the individual requesting to speak in the next 30 seconds by 65%. Using our automated quality estimation, we show that the quality ratings for statements prompted by nudging are similar to those made without nudging, signifying that nudging leads to more ideas being generated in the conversation without losing overall quality.
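The comparison against "the average rating from other human annotators" can be sketched as a leave-one-out evaluation. The abstract does not name the error metric, so mean absolute error is an assumption here, and the function name, annotator labels, and toy scores are all hypothetical.

```python
from statistics import mean


def leave_one_out_mae(human, candidate, held_out):
    """MAE of `candidate` scores against the mean of all human annotators
    except `held_out` (metric choice is an assumption)."""
    errors = []
    for item_id, scores in human.items():
        others = [s for name, s in scores.items() if name != held_out]
        reference = mean(others)  # ground truth: average of the other annotators
        errors.append(abs(candidate[item_id] - reference))
    return mean(errors)


# Toy 1-to-5 scores for two contributions from three hypothetical annotators.
human = {
    "c1": {"a": 5, "b": 4, "c": 3},
    "c2": {"a": 1, "b": 2, "c": 3},
}
model = {"c1": 4, "c2": 2}  # hypothetical LLM scores

# Score annotator "a" and the model against the same reference (mean of "b" and "c").
annotator_a = {item_id: scores["a"] for item_id, scores in human.items()}
human_mae = leave_one_out_mae(human, annotator_a, held_out="a")
model_mae = leave_one_out_mae(human, model, held_out="a")
print(human_mae, model_mae)
```

Holding out the same annotator for both comparisons keeps the reference identical, so the two error values are directly comparable.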

SP · Jan 9, 2022

Signal Reconstruction from Quantized Noisy Samples of the Discrete Fourier Transform

Mohak Goyal, Animesh Kumar

In this paper, we present two variations of an algorithm for signal reconstruction from one-bit or two-bit noisy observations of the discrete Fourier transform (DFT). The one-bit observations of the DFT correspond to the sign of its real part, whereas the two-bit observations correspond to the signs of both the real and imaginary parts of the DFT. We focus on images for analysis and simulations, thus using the sign of the 2D-DFT. This choice of the class of signals is inspired by previous works on this problem. The samples are affected by additive zero-mean noise of known distribution. We solve this signal estimation problem by designing an algorithm that uses contraction mapping, based on the Banach fixed point theorem. For our algorithm, we show that the expected mean squared error (MSE) in signal reconstruction is asymptotically proportional to the inverse of the sampling rate. Numerical tests with four benchmark images are provided to show the effectiveness of our algorithm. Various metrics for image reconstruction quality assessment such as PSNR, SSIM, ESSIM, and MS-SSIM are employed. On all four benchmark images, our algorithm outperforms the state-of-the-art in all of these metrics by a significant margin.
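The observation model described above (not the reconstruction algorithm) can be sketched as follows. Gaussian noise, the convention that a value of exactly zero maps to +1, the naive O(N^4) transform, and all function names are assumptions made for illustration; the abstract only says the noise is additive, zero-mean, and of known distribution.

```python
import cmath
import random


def dft2(image):
    """Naive 2D-DFT of a small real-valued image given as a list of rows."""
    rows, cols = len(image), len(image[0])
    out = [[0j] * cols for _ in range(rows)]
    for u in range(rows):
        for v in range(cols):
            total = 0j
            for m in range(rows):
                for n in range(cols):
                    angle = -2 * cmath.pi * (u * m / rows + v * n / cols)
                    total += image[m][n] * cmath.exp(1j * angle)
            out[u][v] = total
    return out


def sign(x):
    """Sign quantizer; mapping zero to +1 is an assumed convention."""
    return 1 if x >= 0 else -1


def quantize(coeffs, noise_std, bits, rng):
    """bits=1: sign of the noisy real part.
    bits=2: signs of the noisy real and imaginary parts."""
    observations = []
    for row in coeffs:
        obs_row = []
        for c in row:
            re = sign(c.real + rng.gauss(0, noise_std))
            if bits == 1:
                obs_row.append((re,))
            else:
                im = sign(c.imag + rng.gauss(0, noise_std))
                obs_row.append((re, im))
        observations.append(obs_row)
    return observations


if __name__ == "__main__":
    rng = random.Random(0)
    image = [[1, 2], [3, 4]]
    print(quantize(dft2(image), noise_std=0.5, bits=2, rng=rng))
```

Each coefficient keeps only one or two bits, so all magnitude information must be recovered from the noise statistics and from many samples, which is why the sampling rate appears in the error bound.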