Amaury Trujillo

SI
h-index: 36
6 papers
160 citations
Novelty: 39%
AI Score: 43

6 Papers

SI · May 19, 2022
Personalized Interventions for Online Moderation

Stefano Cresci, Amaury Trujillo, Tiziano Fagni

Current online moderation follows a one-size-fits-all approach, where each intervention is applied in the same way to all users. This naive approach is challenged by established socio-behavioral theories and by recent empirical results that showed the limited effectiveness of such interventions. We propose a paradigm shift in online moderation by moving towards a personalized and user-centered approach. Our multidisciplinary vision combines state-of-the-art theories and practices in diverse fields such as computer science, sociology and psychology, to design personalized moderation interventions (PMIs). In outlining the path leading to the next generation of moderation interventions, we also discuss the most prominent challenges introduced by such a disruptive change.

CY · May 17
Disarranged Harmonization of Transparency Reporting by Social Media Platforms Under the Digital Services Act

Amaury Trujillo, Benedetta Tessa, Stefano Cresci

The European Commission recently introduced new regulation to harmonize transparency reporting of large online platforms under the Digital Services Act (DSA). Here, we present the first systematic evaluation of transparency reporting data quality after this normative change, for the eight largest social media platforms in the European Union. In detail, we run a set of large-scale quantitative analyses on key reporting dimensions, followed by a structured comparative assessment across platforms and reporting mechanisms. We find that (i) the analyzed platforms had varying degrees of compliance and data quality, but all exhibited issues on data formatting, timeliness, consistency, and completeness; (ii) some platforms employed differing reporting procedures across mechanisms, which caused them to submit contrasting information; (iii) despite the harmonization, a number of issues still prevent interoperability between reporting mechanisms; and (iv) many of the previously identified issues with transparency reporting are still unresolved. We conclude by discussing implications for transparency auditing and proposing key targeted improvements to strengthen the reliability and interoperability of DSA transparency reporting.

SI · Dec 16, 2023
The DSA Transparency Database: Auditing Self-reported Moderation Actions by Social Media

Amaury Trujillo, Tiziano Fagni, Stefano Cresci

Since September 2023, the Digital Services Act (DSA) obliges large online platforms to submit detailed data on each moderation action they take within the European Union (EU) to the DSA Transparency Database. From its inception, this centralized database has sparked scholarly interest as an unprecedented and potentially unique trove of data on real-world online moderation. Here, we thoroughly analyze all 353.12M records submitted by the eight largest social media platforms in the EU during the first 100 days of the database. Specifically, we conduct a platform-wise comparative study of their: volume of moderation actions, grounds for decision, types of applied restrictions, types of moderated content, timeliness in undertaking and submitting moderation actions, and use of automation. Furthermore, we systematically cross-check the contents of the database with the platforms' own transparency reports. Our analyses reveal that (i) the platforms adhered only in part to the philosophy and structure of the database, (ii) the structure of the database is partially inadequate for the platforms' reporting needs, (iii) the platforms exhibited substantial differences in their moderation actions, (iv) a remarkable fraction of the database data is inconsistent, and (v) the platform X (formerly Twitter) presents the most inconsistencies. Our findings have far-reaching implications for policymakers and scholars across diverse disciplines. They offer guidance for future regulations that cater to the reporting needs of online platforms in general, but also highlight opportunities to improve and refine the database itself.

HC · Dec 10, 2024
Contextualized Counterspeech: Strategies for Adaptation, Personalization, and Evaluation

Lorenzo Cima, Alessio Miaschi, Amaury Trujillo et al.

AI-generated counterspeech offers a promising and scalable strategy to curb online toxicity through direct replies that promote civil discourse. However, current counterspeech is one-size-fits-all, lacking adaptation to the moderation context and the users involved. We propose and evaluate multiple strategies for generating tailored counterspeech that is adapted to the moderation context and personalized for the moderated user. We instruct an LLaMA2-13B model to generate counterspeech, experimenting with various configurations based on different contextual information and fine-tuning strategies. We identify the configurations that generate persuasive counterspeech through a combination of quantitative indicators and human evaluations collected via a pre-registered mixed-design crowdsourcing experiment. Results show that contextualized counterspeech can significantly outperform state-of-the-art generic counterspeech in adequacy and persuasiveness, without compromising other characteristics. Our findings also reveal a poor correlation between quantitative indicators and human evaluations, suggesting that these methods assess different aspects and highlighting the need for nuanced evaluation methodologies. The effectiveness of contextualized AI-generated counterspeech and the divergence between human and algorithmic evaluations underscore the importance of increased human-AI collaboration in content moderation.

SI · Apr 21
When Transparency Falls Short: Auditing Platform Moderation During a High-Stakes Election

Benedetta Tessa, Gautam Kishore Shahi, Amaury Trujillo et al.

During major political events, social media platforms encounter increased systemic risks. However, it is still unclear if and how they adjust their moderation practices in response. The Digital Services Act Transparency Database provides, for the first time, an opportunity to systematically examine content moderation at scale, allowing researchers and policymakers to evaluate platforms' compliance and effectiveness, especially at high-stakes times. Here we analyze 1.58 billion self-reported moderation actions by the eight largest social media platforms in Europe over an eight-month period surrounding the 2024 European Parliament elections. We found that platforms did not exhibit meaningful signs of adaptation in moderation strategies as their self-reported enforcement patterns did not change significantly around the elections. This raises questions about whether platforms made any concrete adjustments, or whether the structure of the database may have masked them. On top of that, we reveal that initial concerns regarding platforms' transparency and accountability still persist one year after the launch of the Transparency Database. Our findings highlight the limits of current self-regulatory approaches and point to the need for stronger enforcement and better data access mechanisms to ensure that online platforms meet their responsibilities in protecting the democratic processes.

SI · Jan 17, 2022
Make Reddit Great Again: Assessing Community Effects of Moderation Interventions on r/The_Donald

Amaury Trujillo, Stefano Cresci

The subreddit r/The_Donald was repeatedly denounced as a toxic and misbehaving online community, reasons for which it faced a sequence of increasingly constraining moderation interventions by Reddit administrators. It was quarantined in June 2019, restricted in February 2020, and finally banned in June 2020, but despite precursory work on the matter, the effects of this sequence of interventions are still unclear. In this work, we follow a multidimensional causal inference approach to study data containing more than 15M posts made in a time frame of 2 years, to examine the effects of such interventions inside and outside of the subreddit. We find that the interventions greatly reduced the activity of problematic users. However, the interventions also caused an increase in toxicity and led users to share more polarized and less factual news. In addition, the restriction had stronger effects than the quarantine, and core users of r/The_Donald suffered stronger effects than the rest of users. Overall, our results provide evidence that the interventions had mixed effects and paint a nuanced picture of the consequences of community-level moderation strategies. We conclude by reflecting on the challenges of policing online platforms and on the implications for the design and deployment of moderation interventions.