Walter Quattrociocchi

SI
h-index: 42
13 papers
1,327 citations
Novelty: 35%
AI Score: 44

13 Papers

SD · Jan 13, 2025
Decoding Musical Evolution Through Network Science

Niccolo' Di Marco, Edoardo Loru, Alessandro Galeazzi et al.

Music has always been central to human culture, reflecting and shaping traditions, emotions, and societal changes. Technological advancements have transformed how music is created and consumed, influencing tastes and the music itself. In this study, we use Network Science to analyze musical complexity. Drawing on approximately 20,000 MIDI files across six macro-genres spanning nearly four centuries, we represent each composition as a weighted directed network to study its structural properties. Our results show that Classical and Jazz compositions have higher complexity and melodic diversity than recently developed genres. However, a temporal analysis reveals a trend toward simplification, with even Classical and Jazz nearing the complexity levels of modern genres. This study highlights how digital tools and streaming platforms shape musical evolution, fostering new genres while driving homogenization and simplicity.
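The abstract does not say how nodes and edges are defined, so the following is only a minimal sketch of one plausible reading: nodes are MIDI pitch numbers and each edge weight counts how often one pitch is immediately followed by another. The function names and the toy melody are hypothetical, not taken from the paper.

```python
from collections import Counter


def transition_network(notes):
    """Build a weighted directed edge map {(src, dst): count}.

    Assumption: an edge links each note to the note that follows it,
    and its weight is the number of times that transition occurs.
    """
    return Counter(zip(notes, notes[1:]))


def out_strength(edges):
    """Sum the outgoing edge weights of every source node."""
    strength = Counter()
    for (src, _dst), weight in edges.items():
        strength[src] += weight
    return strength


# Toy melody as MIDI pitch numbers (hypothetical data).
melody = [60, 62, 64, 62, 60, 62, 64, 65]
edges = transition_network(melody)
strength = out_strength(edges)
```

Under this reading, the number of distinct edges and the spread of their weights give simple handles on how varied a melody's transitions are.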

CY · Dec 22, 2025
Epistemological Fault Lines Between Human and Artificial Intelligence

Walter Quattrociocchi, Valerio Capraro, Matjaž Perc

Large language models (LLMs) are widely described as artificial intelligence, yet their epistemic profile diverges sharply from human cognition. Here we show that the apparent alignment between human and machine outputs conceals a deeper structural mismatch in how judgments are produced. Tracing the historical shift from symbolic AI and information filtering systems to large-scale generative transformers, we argue that LLMs are not epistemic agents but stochastic pattern-completion systems, formally describable as walks on high-dimensional graphs of linguistic transitions rather than as systems that form beliefs or models of the world. By systematically mapping human and artificial epistemic pipelines, we identify seven epistemic fault lines, divergences in grounding, parsing, experience, motivation, causal reasoning, metacognition, and value. We call the resulting condition Epistemia: a structural situation in which linguistic plausibility substitutes for epistemic evaluation, producing the feeling of knowing without the labor of judgment. We conclude by outlining consequences for evaluation, governance, and epistemic literacy in societies increasingly organized around generative AI.

CL · Feb 6, 2025
The simulation of judgment in LLMs

Edoardo Loru, Jacopo Nudo, Niccolò Di Marco et al.

Large Language Models (LLMs) are increasingly embedded in evaluative processes, from information filtering to assessing and addressing knowledge gaps through explanation and credibility judgments. This raises the need to examine how such evaluations are built, what assumptions they rely on, and how their strategies diverge from those of humans. We benchmark six LLMs against expert ratings--NewsGuard and Media Bias/Fact Check--and against human judgments collected through a controlled experiment. We use news domains purely as a controlled benchmark for evaluative tasks, focusing on the underlying mechanisms rather than on news classification per se. To enable direct comparison, we implement a structured agentic framework in which both models and nonexpert participants follow the same evaluation procedure: selecting criteria, retrieving content, and producing justifications. Despite output alignment, our findings show consistent differences in the observable criteria guiding model evaluations, suggesting that lexical associations and statistical priors could influence evaluations in ways that differ from contextual reasoning. This reliance is associated with systematic effects: political asymmetries and a tendency to confuse linguistic form with epistemic reliability--a dynamic we term epistemia, the illusion of knowledge that emerges when surface plausibility replaces verification. Indeed, delegating judgment to such systems may affect the heuristics underlying evaluative processes, suggesting a shift from normative reasoning toward pattern-based approximation and raising open questions about the role of LLMs in evaluative processes.

HC · Jul 1, 2025
Generative Exaggeration in LLM Social Agents: Consistency, Bias, and Toxicity

Jacopo Nudo, Mario Edoardo Pandolfo, Edoardo Loru et al.

We investigate how Large Language Models (LLMs) behave when simulating political discourse on social media. Leveraging 21 million interactions on X during the 2024 U.S. presidential election, we construct LLM agents based on 1,186 real users, prompting them to reply to politically salient tweets under controlled conditions. Agents are initialized either with minimal ideological cues (Zero Shot) or recent tweet history (Few Shot), allowing one-to-one comparisons with human replies. We evaluate three model families (Gemini, Mistral, and DeepSeek) across linguistic style, ideological consistency, and toxicity. We find that richer contextualization improves internal consistency but also amplifies polarization, stylized signals, and harmful language. We observe an emergent distortion that we call "generation exaggeration": a systematic amplification of salient traits beyond empirical baselines. Our analysis shows that LLMs do not emulate users, they reconstruct them. Their outputs, indeed, reflect internal optimization dynamics more than observed behavior, introducing structural biases that compromise their reliability as social proxies. This challenges their use in content moderation, deliberative simulations, and policy modeling.

CL · Feb 20
The Statistical Signature of LLMs

Ortal Hadad, Edoardo Loru, Jacopo Nudo et al.

Large language models generate text through probabilistic sampling from high-dimensional distributions, yet how this process reshapes the structural statistical organization of language remains incompletely characterized. Here we show that lossless compression provides a simple, model-agnostic measure of statistical regularity that differentiates generative regimes directly from surface text. We analyze compression behavior across three progressively more complex information ecosystems: controlled human-LLM continuations, generative mediation of a knowledge infrastructure (Wikipedia vs. Grokipedia), and fully synthetic social interaction environments (Moltbook vs. Reddit). Across settings, compression reveals a persistent structural signature of probabilistic generation. In controlled and mediated contexts, LLM-produced language exhibits higher structural regularity and compressibility than human-written text, consistent with a concentration of output within highly recurrent statistical patterns. However, this signature shows scale dependence: in fragmented interaction environments the separation attenuates, suggesting a fundamental limit to surface-level distinguishability at small scales. This compressibility-based separation emerges consistently across models, tasks, and domains and can be observed directly from surface text without relying on model internals or semantic evaluation. Overall, our findings introduce a simple and robust framework for quantifying how generative systems reshape textual production, offering a structural perspective on the evolving complexity of communication.
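As a rough illustration of the compression-based measure described above, the sketch below computes a compressed-to-raw size ratio. The choice of `zlib` is an assumption (the abstract names no specific compressor), and the two sample strings are synthetic stand-ins for high and low regularity, not human or LLM text.

```python
import random
import zlib


def compression_ratio(text, level=9):
    """Return compressed size divided by raw size (lower = more regular)."""
    raw = text.encode("utf-8")
    if not raw:
        raise ValueError("text must be non-empty")
    return len(zlib.compress(raw, level)) / len(raw)


# Highly repetitive string: stands in for text with strong statistical regularity.
repetitive = "the model said that the answer is clear. " * 50

# Seeded random letters of the same length: a low-regularity contrast case.
rng = random.Random(0)
alphabet = "abcdefghijklmnopqrstuvwxyz "
varied = "".join(rng.choice(alphabet) for _ in range(len(repetitive)))

ratio_repetitive = compression_ratio(repetitive)
ratio_varied = compression_ratio(varied)
```

Comparing such ratios is only meaningful for texts of similar length, since the compressor's fixed overhead dominates on very short inputs; this is consistent with the scale dependence the abstract reports.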

SI · May 28, 2021
Online Hate: Behavioural Dynamics and Relationship with Misinformation

Matteo Cinelli, Andraž Pelicon, Igor Mozetič et al.

Online debates are often characterised by extreme polarisation and heated discussions among users. The presence of hate speech online is becoming increasingly problematic, making the development of appropriate countermeasures necessary. In this work, we perform hate speech detection on a corpus of more than one million comments on YouTube videos through a machine learning model fine-tuned on a large set of hand-annotated data. Our analysis shows that there is no evidence of the presence of "serial haters", understood as active users posting exclusively hateful comments. Moreover, consistent with the echo chamber hypothesis, we find that users skewed towards one of the two categories of video channels (questionable, reliable) are more prone to use inappropriate, violent, or hateful language within their opponents' community. Interestingly, users loyal to reliable sources use, on average, more toxic language than their counterparts. Finally, we find that the overall toxicity of the discussion increases with its length, measured both in terms of number of comments and time. Our results show that, consistent with Godwin's law, online debates tend to degenerate towards increasingly toxic exchanges of views.

SI · Feb 12, 2019
RTbust: Exploiting Temporal Patterns for Botnet Detection on Twitter

Michele Mazza, Stefano Cresci, Marco Avvenuti et al.

Within OSNs, many of our supposedly online friends may instead be fake accounts called social bots, part of large groups that purposely re-share targeted content. Here, we study retweeting behaviors on Twitter, with the ultimate goal of detecting retweeting social bots. We collect a dataset of 10M retweets. We design a novel visualization that we leverage to highlight benign and malicious patterns of retweeting activity. In this way, we uncover a 'normal' retweeting pattern that is characteristic of human-operated accounts, and 3 suspicious patterns related to bot activities. Then, we propose a bot detection technique that stems from the previous exploration of retweeting behaviors. Our technique, called Retweet-Buster (RTbust), leverages unsupervised feature extraction and clustering. An LSTM autoencoder converts the retweet time series into compact and informative latent feature vectors, which are then clustered with a hierarchical density-based algorithm. Accounts belonging to large clusters characterized by malicious retweeting patterns are labeled as bots. RTbust obtains excellent detection results, with F1 = 0.87, whereas competitors achieve F1 < 0.76. Finally, we apply RTbust to a large dataset of retweets, uncovering 2 previously unknown active botnets with hundreds of accounts.

CY · Oct 14, 2015
Debunking in a World of Tribes

Fabiana Zollo, Alessandro Bessi, Michela Del Vicario et al.

Recently a simple military exercise on the Internet was perceived as the beginning of a new civil war in the US. Social media aggregate people around common interests eliciting a collective framing of narratives and worldviews. However, the wide availability of user-provided content and the direct path between producers and consumers of information often foster confusion about causations, encouraging mistrust, rumors, and even conspiracy thinking. To counter such a trend, attempts to debunk are often undertaken. Here, we examine the effectiveness of debunking through a quantitative analysis of 54 million users over a time span of five years (Jan 2010 to Dec 2014). In particular, we compare how users interact with proven (scientific) and unsubstantiated (conspiracy-like) information on Facebook in the US. Our findings confirm the existence of echo chambers where users interact primarily with either conspiracy-like or scientific pages. Both groups interact similarly with the information within their echo chamber. We examine 47,780 debunking posts and find that attempts at debunking are largely ineffective. For one, only a small fraction of usual consumers of unsubstantiated information interact with the posts. Furthermore, we show that those few are often the most committed conspiracy users and, rather than internalizing debunking information, they often react to it negatively. Indeed, after interacting with debunking posts, users retain, or even increase, their engagement within the conspiracy echo chamber.

CY · Sep 1, 2015
Echo chambers in the age of misinformation

Michela Del Vicario, Alessandro Bessi, Fabiana Zollo et al.

The wide availability of user-provided content in online social media facilitates the aggregation of people around common interests, worldviews, and narratives. Despite the enthusiastic rhetoric on the part of some that this process generates "collective intelligence", the WWW also allows the rapid dissemination of unsubstantiated conspiracy theories that often elicit rapid, large, but naive social responses such as the recent case of Jade Helm 15 -- where a simple military exercise turned out to be perceived as the beginning of the civil war in the US. We study how Facebook users consume information related to two different kinds of narrative: scientific and conspiracy news. We find that although consumers of scientific and conspiracy stories present similar consumption patterns with respect to content, the sizes of the spreading cascades differ. Homogeneity appears to be the primary driver for the diffusion of contents, but each echo chamber has its own cascade dynamics. To mimic these dynamics, we introduce a data-driven percolation model on signed networks.

SI · Apr 20, 2015
Trend of Narratives in the Age of Misinformation

Alessandro Bessi, Fabiana Zollo, Michela Del Vicario et al.

Social media enabled a direct path from producer to consumer of content, changing the way users get informed, debate, and shape their worldviews. Such disintermediation weakened consensus on socially relevant issues in favor of rumors and mistrust, and fomented conspiracy thinking -- e.g., chem-trails inducing global warming, the link between vaccines and autism, or the New World Order conspiracy. In this work, we study through a thorough quantitative analysis how different conspiracy topics are consumed on Italian Facebook. By means of a semi-automatic topic extraction strategy, we show that the most discussed content semantically refers to four specific categories: environment, diet, health, and geopolitics. We find similar patterns by comparing users' activity (likes and comments) on posts belonging to different semantic categories. However, if we focus on the lifetime -- i.e., the distance in time between the first and the last comment for each user -- we notice a remarkable difference across narratives -- e.g., users polarized on geopolitics are more persistent in commenting, whereas the least persistent are those focused on diet-related topics. Finally, we model users' mobility across various topics, finding that the more active a user is, the more likely they are to join all topics. Once inside a conspiracy narrative, users tend to embrace the overall corpus.
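The lifetime measure defined in this abstract (time between a user's first and last comment) can be sketched as follows. The input layout, a list of `(user, timestamp)` pairs, and the sample records are assumptions made for illustration; the paper's actual data format is not given here.

```python
from datetime import datetime, timedelta


def comment_lifetimes(comments):
    """Map each user to the time between their first and last comment.

    `comments` is an iterable of (user, timestamp) pairs in any order
    (a hypothetical layout chosen for this sketch).
    """
    first = {}
    last = {}
    for user, ts in comments:
        if user not in first or ts < first[user]:
            first[user] = ts
        if user not in last or ts > last[user]:
            last[user] = ts
    return {user: last[user] - first[user] for user in first}


# Hypothetical sample: u1 comments twice, u2 only once.
sample = [
    ("u1", datetime(2014, 1, 1)),
    ("u2", datetime(2014, 2, 1)),
    ("u1", datetime(2014, 1, 31)),
]
lifetimes = comment_lifetimes(sample)
```

A user with a single comment gets a lifetime of zero, so any comparison of persistence across topics would need to decide how to treat such users.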

SI · Jan 28, 2015
Structural Patterns of the Occupy Movement on Facebook

Michela Del Vicario, Qian Zhang, Alessandro Bessi et al.

In this work we study a peculiar example of social organization on Facebook: the Occupy Movement -- i.e., an international protest movement against social and economic inequality organized online at a city level. We consider 179 US Facebook public pages during the time period between September 2011 and February 2013. The dataset includes 618K active users and 753K posts that received about 5.2M likes and 1.1M comments. By labeling users according to their interaction patterns on pages -- e.g., a user is considered to be polarized if she has at least 95% of her likes on a specific page -- we find that activities are not locally coordinated by geographically close pages, but are driven by pages linked to major US cities that act as hubs within the various groups. Such a pattern is verified even by extracting the backbone structure -- i.e., filtering statistically relevant weight heterogeneities -- for both the pages-reshares and the pages-common users networks.
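The polarization rule quoted in this abstract (at least 95% of a user's likes on one page) can be written as a short check. This is a sketch under assumptions: the input is a plain list of page identifiers, one per like, and the function name and sample data are hypothetical.

```python
from collections import Counter


def polarized_page(liked_pages, threshold=0.95):
    """Return the page holding at least `threshold` of the user's likes, else None.

    `liked_pages` lists one page identifier per like (hypothetical layout).
    """
    counts = Counter(liked_pages)
    if not counts:
        return None  # a user with no likes cannot be polarized
    page, n_likes = counts.most_common(1)[0]
    if n_likes / len(liked_pages) >= threshold:
        return page
    return None


# Hypothetical users: 96% of likes on one page vs. 90% on one page.
focused_user = ["occupy_nyc"] * 96 + ["occupy_la"] * 4
mixed_user = ["occupy_nyc"] * 9 + ["occupy_la"]

focused_label = polarized_page(focused_user)
mixed_label = polarized_page(mixed_user)
```

Keeping the threshold as a parameter makes it easy to see how sensitive the labeling is to the 95% cutoff.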

SI · Jan 28, 2015
Everyday the Same Picture: Popularity and Content Diversity

Alessandro Bessi, Fabiana Zollo, Michela Del Vicario et al.

Facebook is flooded by diverse and heterogeneous content, from kittens up to music and news, passing through satirical and funny stories. Each piece of that corpus reflects the heterogeneity of the underlying social background. On Italian Facebook, we have found an interesting case: a page with more than 40K followers that posts the same picture of a popular Italian singer every day. In this work, we use such a page as a control to study and model the relationship between content heterogeneity and popularity. In particular, we use that page for a comparative analysis of information consumption patterns with respect to pages posting science and conspiracy news. In total, we analyze about 2M likes and 190K comments, made by approximately 340K and 65K users, respectively. We conclude the paper by introducing a model mimicking users' selection preferences, accounting for the heterogeneity of contents.

SI · Aug 7, 2014
Science vs Conspiracy: collective narratives in the age of (mis)information

Alessandro Bessi, Mauro Coletto, George Alexandru Davidescu et al.

The large availability of user-provided content on online social media facilitates the aggregation of people around common interests, worldviews and narratives. However, in spite of the enthusiastic rhetoric about the so-called wisdom of crowds, unsubstantiated rumors -- as alternative explanations to mainstream versions of complex phenomena -- find on the Web a natural medium for their dissemination. In this work we study, on a sample of 1.2 million individuals, how information related to very distinct narratives -- i.e., mainstream scientific and alternative news -- is consumed on Facebook. Through a thorough quantitative analysis, we show that distinct communities with similar information consumption patterns emerge around distinctive narratives. Moreover, consumers of alternative news (mainly conspiracy theories) turn out to be more focused on their contents, while scientific news consumers are more prone to comment on alternative news. We conclude our analysis by testing the response of this social system to 4,709 pieces of troll information -- i.e., parodic imitations of alternative and conspiracy theories. We find that, despite the false and satirical vein of the news, usual consumers of conspiracy news are the most prone to interact with them.