HC Mar 4
A Systematic Review of User Experiments Measuring the Effects of Dark Patterns
Brennan Schaffner, Luis Heysen, Marshini Chetty
Deceptive/Manipulative Patterns (DMPs), also known as ``dark patterns,'' are interface designs that manipulate user behavior. While considerable attention has been paid to their ethical and legal implications, empirical evidence about their real-world effects remains diffuse. This review synthesizes up-to-date experimental studies, focusing on works that quantify how (or whether) DMPs influence users. We also aggregate findings on interventions aimed at reducing DMP effects. Our synthesis highlights the experimental agreement that DMPs do significantly alter user behavior (with large variance in effect size) and that external interventions have been mostly unsuccessful in mitigating their effects. Lastly, we show that significant correlations between DMP effects and personal characteristics (e.g., age or political affiliation) are uncommon, indicating that DMPs affected nearly all tested populations similarly. By summarizing the experimental evidence, we clarify the effects of DMPs, highlight gaps and tensions in the existing experimental literature, and help inform ongoing research and policy directions.
CL Jun 9, 2025
Silencing Empowerment, Allowing Bigotry: Auditing the Moderation of Hate Speech on Twitch
Prarabdh Shukla, Wei Yin Chong, Yash Patel et al.
To meet the demands of content moderation, online platforms have resorted to automated systems. Newer forms of real-time engagement ($\textit{e.g.}$, users commenting on live streams) on platforms like Twitch place additional pressure on the latency expected of such moderation systems. Despite their prevalence, relatively little is known about the effectiveness of these systems. In this paper, we conduct an audit of Twitch's automated moderation tool ($\texttt{AutoMod}$) to investigate its effectiveness in flagging hateful content. For our audit, we create streaming accounts to act as siloed test beds, and interface with the live chat using Twitch's APIs to send over $107,000$ comments collated from $4$ datasets. We measure $\texttt{AutoMod}$'s accuracy in flagging blatantly hateful content containing misogyny, racism, ableism and homophobia. Our experiments reveal that a large fraction of hateful messages, up to $94\%$ on some datasets, $\textit{bypass moderation}$. Contextual addition of slurs to these messages results in $100\%$ removal, revealing $\texttt{AutoMod}$'s reliance on slurs as a moderation signal. We also find that, contrary to Twitch's community guidelines, $\texttt{AutoMod}$ blocks up to $89.5\%$ of benign examples that use sensitive words in pedagogical or empowering contexts. Overall, our audit points to large gaps in $\texttt{AutoMod}$'s capabilities and underscores how important it is for such systems to understand context effectively.