Walter Quattrociocchi

h-index 41

9 papers

1,272 citations

Novelty 32%

AI Score 40

Ranked #70,792 of 194,257 authors (top 36%); #73 in SI (top 22%)

9 Papers

9.7 CY Dec 22, 2025

Epistemological Fault Lines Between Human and Artificial Intelligence

Walter Quattrociocchi, Valerio Capraro, Matjaž Perc

Large language models (LLMs) are widely described as artificial intelligence, yet their epistemic profile diverges sharply from human cognition. Here we show that the apparent alignment between human and machine outputs conceals a deeper structural mismatch in how judgments are produced. Tracing the historical shift from symbolic AI and information filtering systems to large-scale generative transformers, we argue that LLMs are not epistemic agents but stochastic pattern-completion systems, formally describable as walks on high-dimensional graphs of linguistic transitions rather than as systems that form beliefs or models of the world. By systematically mapping human and artificial epistemic pipelines, we identify seven epistemic fault lines, divergences in grounding, parsing, experience, motivation, causal reasoning, metacognition, and value. We call the resulting condition Epistemia: a structural situation in which linguistic plausibility substitutes for epistemic evaluation, producing the feeling of knowing without the labor of judgment. We conclude by outlining consequences for evaluation, governance, and epistemic literacy in societies increasingly organized around generative AI.

0.6 CL Feb 20

The Statistical Signature of LLMs

Ortal Hadad, Edoardo Loru, Jacopo Nudo et al.

Large language models generate text through probabilistic sampling from high-dimensional distributions, yet how this process reshapes the structural statistical organization of language remains incompletely characterized. Here we show that lossless compression provides a simple, model-agnostic measure of statistical regularity that differentiates generative regimes directly from surface text. We analyze compression behavior across three progressively more complex information ecosystems: controlled human-LLM continuations, generative mediation of a knowledge infrastructure (Wikipedia vs. Grokipedia), and fully synthetic social interaction environments (Moltbook vs. Reddit). Across settings, compression reveals a persistent structural signature of probabilistic generation. In controlled and mediated contexts, LLM-produced language exhibits higher structural regularity and compressibility than human-written text, consistent with a concentration of output within highly recurrent statistical patterns. However, this signature shows scale dependence: in fragmented interaction environments the separation attenuates, suggesting a fundamental limit to surface-level distinguishability at small scales. This compressibility-based separation emerges consistently across models, tasks, and domains and can be observed directly from surface text without relying on model internals or semantic evaluation. Overall, our findings introduce a simple and robust framework for quantifying how generative systems reshape textual production, offering a structural perspective on the evolving complexity of communication.
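
The compression-based measure described in this abstract can be illustrated with a minimal sketch. It assumes zlib (DEFLATE) as the lossless compressor and compressed-to-raw byte ratio as the regularity score; the abstract does not name the compressor or preprocessing actually used, and the function name `compression_ratio` and both sample strings are hypothetical.

```python
import random
import string
import zlib


def compression_ratio(text: str, level: int = 9) -> float:
    """Return compressed size / raw size in bytes (lower = more regular text).

    Assumption: zlib stands in for whatever lossless compressor is used.
    """
    raw = text.encode("utf-8")
    if not raw:
        raise ValueError("text must be non-empty")
    return len(zlib.compress(raw, level)) / len(raw)


# Illustrative inputs only: a highly repetitive string versus a string of
# pseudo-random letters of the same length (fixed seed for reproducibility).
repetitive = "the cat sat on the mat. " * 40
rng = random.Random(0)
varied = "".join(rng.choice(string.ascii_letters) for _ in range(len(repetitive)))

ratio_repetitive = compression_ratio(repetitive)
ratio_varied = compression_ratio(varied)
print(f"repetitive: {ratio_repetitive:.3f}  varied: {ratio_varied:.3f}")
```

Under this score, text concentrated in recurrent patterns yields a lower ratio. Very short inputs are dominated by compressor overhead, which is one plausible reading of the scale dependence the abstract reports.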

3.3 SI May 28, 2021

Online Hate: Behavioural Dynamics and Relationship with Misinformation

Matteo Cinelli, Andraž Pelicon, Igor Mozetič et al.

Online debates are often characterised by extreme polarisation and heated discussions among users. The presence of hate speech online is becoming increasingly problematic, making necessary the development of appropriate countermeasures. In this work, we perform hate speech detection on a corpus of more than one million comments on YouTube videos through a machine learning model fine-tuned on a large set of hand-annotated data. Our analysis shows that there is no evidence of the presence of "serial haters", understood as active users posting exclusively hateful comments. Moreover, consistent with the echo chamber hypothesis, we find that users skewed towards one of the two categories of video channels (questionable, reliable) are more prone to use inappropriate, violent, or hateful language within their opponents' community. Interestingly, users loyal to reliable sources use, on average, more toxic language than their counterparts. Finally, we find that the overall toxicity of the discussion increases with its length, measured both in terms of number of comments and time. Our results show that, consistent with Godwin's law, online debates tend to degenerate towards increasingly toxic exchanges of views.

17.0 SI Feb 12, 2019

RTbust: Exploiting Temporal Patterns for Botnet Detection on Twitter

Michele Mazza, Stefano Cresci, Marco Avvenuti et al.

Within OSNs, many of our supposedly online friends may instead be fake accounts called social bots, part of large groups that purposely re-share targeted content. Here, we study retweeting behaviors on Twitter, with the ultimate goal of detecting retweeting social bots. We collect a dataset of 10M retweets. We design a novel visualization that we leverage to highlight benign and malicious patterns of retweeting activity. In this way, we uncover a 'normal' retweeting pattern that is peculiar to human-operated accounts, and 3 suspicious patterns related to bot activities. Then, we propose a bot detection technique that stems from the previous exploration of retweeting behaviors. Our technique, called Retweet-Buster (RTbust), leverages unsupervised feature extraction and clustering. An LSTM autoencoder converts the retweet time series into compact and informative latent feature vectors, which are then clustered with a hierarchical density-based algorithm. Accounts belonging to large clusters characterized by malicious retweeting patterns are labeled as bots. RTbust obtains excellent detection results, with F1 = 0.87, whereas competitors achieve F1 < 0.76. Finally, we apply RTbust to a large dataset of retweets, uncovering 2 previously unknown active botnets with hundreds of accounts.

11.3 CY Oct 14, 2015

Debunking in a World of Tribes

Fabiana Zollo, Alessandro Bessi, Michela Del Vicario et al.

Recently a simple military exercise on the Internet was perceived as the beginning of a new civil war in the US. Social media aggregate people around common interests eliciting a collective framing of narratives and worldviews. However, the wide availability of user-provided content and the direct path between producers and consumers of information often foster confusion about causations, encouraging mistrust, rumors, and even conspiracy thinking. To counter such a trend, attempts to debunk are often undertaken. Here, we examine the effectiveness of debunking through a quantitative analysis of 54 million users over a time span of five years (Jan 2010 to Dec 2014). In particular, we compare how users interact with proven (scientific) and unsubstantiated (conspiracy-like) information on Facebook in the US. Our findings confirm the existence of echo chambers where users interact primarily with either conspiracy-like or scientific pages. Both groups interact similarly with the information within their echo chamber. We examine 47,780 debunking posts and find that attempts at debunking are largely ineffective. For one, only a small fraction of usual consumers of unsubstantiated information interact with the posts. Furthermore, we show that those few are often the most committed conspiracy users and rather than internalizing debunking information, they often react to it negatively. Indeed, after interacting with debunking posts, users retain, or even increase, their engagement within the conspiracy echo chamber.

5.9 CY Sep 1, 2015

Echo chambers in the age of misinformation

Michela Del Vicario, Alessandro Bessi, Fabiana Zollo et al.

The wide availability of user-provided content in online social media facilitates the aggregation of people around common interests, worldviews, and narratives. Despite the enthusiastic rhetoric on the part of some that this process generates "collective intelligence", the WWW also allows the rapid dissemination of unsubstantiated conspiracy theories that often elicit rapid, large, but naive social responses, such as the recent case of Jade Helm 15 -- where a simple military exercise turned out to be perceived as the beginning of the civil war in the US. We study how Facebook users consume information related to two different kinds of narrative: scientific and conspiracy news. We find that although consumers of scientific and conspiracy stories present similar consumption patterns with respect to content, the sizes of the spreading cascades differ. Homogeneity appears to be the primary driver for the diffusion of contents, but each echo chamber has its own cascade dynamics. To mimic these dynamics, we introduce a data-driven percolation model on signed networks.

5.9 SI Apr 20, 2015

Trend of Narratives in the Age of Misinformation

Alessandro Bessi, Fabiana Zollo, Michela Del Vicario et al.

Social media enabled a direct path from producer to consumer of contents, changing the way users get informed, debate, and shape their worldviews. Such a disintermediation weakened consensus on socially relevant issues in favor of rumors, mistrust, and fomented conspiracy thinking -- e.g., chem-trails inducing global warming, the link between vaccines and autism, or the New World Order conspiracy. In this work, we study through a thorough quantitative analysis how different conspiracy topics are consumed in the Italian Facebook. By means of a semi-automatic topic extraction strategy, we show that the most discussed contents semantically refer to four specific categories: environment, diet, health, and geopolitics. We find similar patterns by comparing users' activity (likes and comments) on posts belonging to different semantic categories. However, if we focus on the lifetime -- i.e., the distance in time between the first and the last comment for each user -- we notice a remarkable difference within narratives -- e.g., users polarized on geopolitics are more persistent in commenting, whereas the less persistent are those focused on diet-related topics. Finally, we model users' mobility across various topics, finding that the more active a user is, the more likely they are to join all topics. Once inside a conspiracy narrative, users tend to embrace the overall corpus.
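
The lifetime measure defined in this abstract (the time between a user's first and last comment) can be sketched as follows. The input layout, a list of `(user_id, timestamp)` pairs, and the function name `user_lifetimes` are assumptions made for illustration; the sample records are invented and do not come from the paper's data.

```python
from datetime import datetime, timedelta


def user_lifetimes(comments):
    """Map each user to the time elapsed between their first and last comment.

    `comments` is an iterable of (user_id, timestamp) pairs in any order.
    A user with a single comment gets a lifetime of zero.
    """
    first, last = {}, {}
    for user, ts in comments:
        if user not in first or ts < first[user]:
            first[user] = ts
        if user not in last or ts > last[user]:
            last[user] = ts
    return {user: last[user] - first[user] for user in first}


# Invented sample records, for illustration only.
sample = [
    ("alice", datetime(2014, 3, 1, 12, 0)),
    ("bob", datetime(2014, 3, 2, 9, 30)),
    ("alice", datetime(2014, 3, 3, 12, 0)),
]
lifetimes = user_lifetimes(sample)
```

Grouping these per-user lifetimes by topic category would allow the kind of persistence comparison the abstract describes, for example between geopolitics and diet.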

2.3 SI Jan 28, 2015

Everyday the Same Picture: Popularity and Content Diversity

Alessandro Bessi, Fabiana Zollo, Michela Del Vicario et al.

Facebook is flooded by diverse and heterogeneous content, from kittens up to music and news, passing through satirical and funny stories. Each piece of that corpus reflects the heterogeneity of the underlying social background. In the Italian Facebook we have found an interesting case: a page having more than 40K followers that every day posts the same picture of a popular Italian singer. In this work, we use such a page as a control to study and model the relationship between content heterogeneity and popularity. In particular, we use that page for a comparative analysis of information consumption patterns with respect to pages posting science and conspiracy news. In total, we analyze about 2M likes and 190K comments, made by approximately 340K and 65K users, respectively. We conclude the paper by introducing a model mimicking users' selection preferences accounting for the heterogeneity of contents.

12.6 SI Aug 7, 2014

Science vs Conspiracy: collective narratives in the age of (mis)information

Alessandro Bessi, Mauro Coletto, George Alexandru Davidescu et al.

The large availability of user-provided content on online social media facilitates the aggregation of people around common interests, worldviews and narratives. However, in spite of the enthusiastic rhetoric about the so-called wisdom of crowds, unsubstantiated rumors -- as alternative explanations to mainstream versions of complex phenomena -- find on the Web a natural medium for their dissemination. In this work we study, on a sample of 1.2 million individuals, how information related to very distinct narratives -- i.e. mainstream scientific and alternative news -- is consumed on Facebook. Through a thorough quantitative analysis, we show that distinct communities with similar information consumption patterns emerge around distinctive narratives. Moreover, consumers of alternative news (mainly conspiracy theories) turn out to be more focused on their contents, while scientific news consumers are more prone to comment on alternative news. We conclude our analysis by testing the response of this social system to 4,709 pieces of troll information -- i.e. parodic imitations of alternative and conspiracy theories. We find that, despite the false and satirical vein of this news, usual consumers of conspiracy news are the most prone to interact with it.