Tiziano Piccardi

CL
h-index15
11papers
388citations
Novelty49%
AI Score46

11 Papers

HCMar 16
Value Alignment of Social Media Ranking Algorithms

Farnaz Jahanbakhsh, Dora Zhao, Tiziano Piccardi et al. · mit

While social media feed rankings are primarily driven by engagement signals rather than any explicit value system, the resulting algorithmic feeds are not value-neutral: engagement may prioritize specific individualistic values. This paper presents an approach for social media feed value alignment. We adopt Schwartz's theory of Basic Human Values -- a broad set of human values that articulates complementary and opposing values forming the building blocks of many cultures -- and we implement an algorithmic approach that models and then ranks feeds by expressions of Schwartz's values in social media posts. Our approach enables controls where users can express weights on their desired values, combining these weights and post value expressions into a ranking that respects users' articulated trade-offs. Through controlled experiments (N=141 and N=250), we demonstrate that users can use these controls to architect feeds reflecting their desired values. Across users, value-ranked feeds align with personal values, diverging substantially from existing engagement-driven feeds.

SIMar 20
Whose Values? Measuring the (Subjective) Expression of Basic Human Values in Social Media

Ziv Epstein, Farnaz Jahanbakhsh, Tiziano Piccardi et al. · mit

The value alignment of sociotechnical systems has become a central debate, but progress depends on how human values are perceived in the content these systems surface and how such perceptions can be measured at scale. Social media platforms are a prominent class of sociotechnical systems where algorithmic curation shapes exposure to value-laden content at scale. Large-language models offer new opportunities for measuring expressions of human values (e.g., humility or equality) in social media data, but value expressions can be subjective: different people will annotate the same post with different values. In this paper, we draw on the Schwartz value system as a broadly encompassing and theoretically grounded set of basic human values, and introduce a framework to personalize the measurement of expressions of Schwartz values in social media posts at scale. We collect 32,370 ground truth value expression annotations from N=1,079 people on 5,211 social media posts representative of real users' feeds. Due to the subjectivity of the task, we observe low levels of inter-rater agreement between people, and low agreement between human raters and LLM-based methods. In response, we construct a personalization architecture for classifying value expressions by learning from a small number of highly informative calibration annotations per user. In evaluation, we find that modeling these differences successfully yields value expression predictions that people agree with more than they agree with other people. These results contribute new methods and understanding for the measurement of human values in social media data.

CLOct 17, 2023
CoMPosT: Characterizing and Evaluating Caricature in LLM Simulations

Myra Cheng, Tiziano Piccardi, Diyi Yang

Recent work has aimed to capture nuances of human behavior by using LLMs to simulate responses from particular demographics in settings like social science experiments and public opinion surveys. However, there are currently no established ways to discuss or evaluate the quality of such LLM simulations. Moreover, there is growing concern that these LLM simulations are flattened caricatures of the personas that they aim to simulate, failing to capture the multidimensionality of people and perpetuating stereotypes. To bridge these gaps, we present CoMPosT, a framework to characterize LLM simulations using four dimensions: Context, Model, Persona, and Topic. We use this framework to measure open-ended LLM simulations' susceptibility to caricature, defined via two criteria: individuation and exaggeration. We evaluate the level of caricature in scenarios from existing work on LLM simulations. We find that for GPT-4, simulations of certain demographics (political and marginalized groups) and topics (general, uncontroversial) are highly susceptible to caricature.

CYAug 31, 2023
In-class Data Analysis Replications: Teaching Students while Testing Science

Kristina Gligoric, Tiziano Piccardi, Jake Hofman et al.

Science is facing a reproducibility crisis. Previous work has proposed incorporating data analysis replications into classrooms as a potential solution. However, despite the potential benefits, it is unclear whether this approach is feasible, and if so, what the involved stakeholders-students, educators, and scientists-should expect from it. Can students perform a data analysis replication over the course of a class? What are the costs and benefits for educators? And how can this solution help benchmark and improve the state of science? In the present study, we incorporated data analysis replications in the project component of the Applied Data Analysis course (CS-401) taught at EPFL (N=354 students). Here we report pre-registered findings based on surveys administered throughout the course. First, we demonstrate that students can replicate previously published scientific papers, most of them qualitatively and some exactly. We find discrepancies between what students expect of data analysis replications and what they experience by doing them along with changes in expectations about reproducibility, which together serve as evidence of attitude shifts to foster students' critical thinking. Second, we provide information for educators about how much overhead is needed to incorporate replications into the classroom and identify concerns that replications bring as compared to more traditional assignments. Third, we identify tangible benefits of the in-class data analysis replications for scientific communities, such as a collection of replication reports and insights about replication barriers in scientific work that should be avoided going forward. Overall, we demonstrate that incorporating replication tasks into a large data science class can increase the reproducibility of scientific work as a by-product of data science instruction, thus benefiting both science and students.

CLSep 23, 2020Code
Crosslingual Topic Modeling with WikiPDA

Tiziano Piccardi, Robert West

We present Wikipedia-based Polyglot Dirichlet Allocation (WikiPDA), a crosslingual topic model that learns to represent Wikipedia articles written in any language as distributions over a common set of language-independent topics. It leverages the fact that Wikipedia articles link to each other and are mapped to concepts in the Wikidata knowledge base, such that, when represented as bags of links, articles are inherently language-independent. WikiPDA works in two steps, by first densifying bags of links using matrix completion and then training a standard monolingual topic model. A human evaluation shows that WikiPDA produces more coherent topics than monolingual text-based LDA, thus offering crosslinguality at no cost. We demonstrate WikiPDA's utility in two applications: a study of topical biases in 28 Wikipedia editions, and crosslingual supervised classification. Finally, we highlight WikiPDA's capacity for zero-shot language transfer, where a model is reused for new languages without any fine-tuning. Researchers can benefit from WikiPDA as a practical tool for studying Wikipedia's content across its 299 language editions in interpretable ways, via an easy-to-use library publicly available at https://github.com/epfl-dlab/WikiPDA.

CLFeb 3, 2024
AnthroScore: A Computational Linguistic Measure of Anthropomorphism

Myra Cheng, Kristina Gligoric, Tiziano Piccardi et al.

Anthropomorphism, or the attribution of human-like characteristics to non-human entities, has shaped conversations about the impacts and possibilities of technology. We present AnthroScore, an automatic metric of implicit anthropomorphism in language. We use a masked language model to quantify how non-human entities are implicitly framed as human by the surrounding context. We show that AnthroScore corresponds with human judgments of anthropomorphism and dimensions of anthropomorphism described in social science literature. Motivated by concerns of misleading anthropomorphism in computer science discourse, we use AnthroScore to analyze 15 years of research papers and downstream news articles. In research papers, we find that anthropomorphism has steadily increased over time, and that papers related to language models have the most anthropomorphism. Within ACL papers, temporal increases in anthropomorphism are correlated with key neural advancements. Building upon concerns of scientific misinformation in mass media, we identify higher levels of anthropomorphism in news headlines compared to the research papers they cite. Since AnthroScore is lexicon-free, it can be directly applied to a wide range of text sources.

CYNov 22, 2024
Social Media Algorithms Can Shape Affective Polarization via Exposure to Antidemocratic Attitudes and Partisan Animosity

Tiziano Piccardi, Martin Saveski, Chenyan Jia et al.

There is widespread concern about the negative impacts of social media feed ranking algorithms on political polarization. Leveraging advancements in large language models (LLMs), we develop an approach to re-rank feeds in real-time to test the effects of content that is likely to polarize: expressions of antidemocratic attitudes and partisan animosity (AAPA). In a preregistered 10-day field experiment on X/Twitter with 1,256 consented participants, we increase or decrease participants' exposure to AAPA in their algorithmically curated feeds. We observe more positive outparty feelings when AAPA exposure is decreased and more negative outparty feelings when AAPA exposure is increased. Exposure to AAPA content also results in an immediate increase in negative emotions, such as sadness and anger. The interventions do not significantly impact traditional engagement metrics such as re-post and favorite rates. These findings highlight a potential pathway for developing feed algorithms that mitigate affective polarization by addressing content that undermines the shared values required for a healthy democracy.

CLJan 10, 2022
Homepage2Vec: Language-Agnostic Website Embedding and Classification

Sylvain Lugeon, Tiziano Piccardi, Robert West

Currently, publicly available models for website classification do not offer an embedding method and have limited support for languages beyond English. We release a dataset of more than two million category-labeled websites in 92 languages collected from Curlie, the largest multilingual human-edited Web directory. The dataset contains 14 website categories aligned across languages. Alongside it, we introduce Homepage2Vec, a machine-learned pre-trained model for classifying and embedding websites based on their homepage in a language-agnostic way. Homepage2Vec, thanks to its feature set (textual content, metadata tags, and visual attributes) and recent progress in natural language representation, is language-independent by design and generates embedding-based representations. We show that Homepage2Vec correctly classifies websites with a macro-averaged F1-score of 0.90, with stable performance across low- as well as high-resource languages. Feature analysis shows that a small subset of efficiently computable features suffices to achieve high performance even with limited computational resources. We make publicly available the curated Curlie dataset aligned across languages, the pre-trained Homepage2Vec model, and libraries

IRApr 17, 2018
Structuring Wikipedia Articles with Section Recommendations

Tiziano Piccardi, Michele Catasta, Leila Zia et al.

Sections are the building blocks of Wikipedia articles. They enhance readability and can be used as a structured entry point for creating and expanding articles. Structuring a new or already existing Wikipedia article with sections is a hard task for humans, especially for newcomers or less experienced editors, as it requires significant knowledge about how a well-written article looks for each possible topic. Inspired by this need, the present paper defines the problem of section recommendation for Wikipedia articles and proposes several approaches for tackling it. Our systems can help editors by recommending what sections to add to already existing or newly created Wikipedia articles. Our basic paradigm is to generate recommendations by sourcing sections from articles that are similar to the input article. We explore several ways of defining similarity for this purpose (based on topic modeling, collaborative filtering, and Wikipedia's category system). We use both automatic and human evaluation approaches for assessing the performance of our recommendation system, concluding that the category-based approach works best, achieving precision@10 of about 80% in the human evaluation.

SIApr 7, 2018
Quootstrap: Scalable Unsupervised Extraction of Quotation-Speaker Pairs from Large News Corpora via Bootstrapping

Dario Pavllo, Tiziano Piccardi, Robert West

We propose Quootstrap, a method for extracting quotations, as well as the names of the speakers who uttered them, from large news corpora. Whereas prior work has addressed this problem primarily with supervised machine learning, our approach follows a fully unsupervised bootstrapping paradigm. It leverages the redundancy present in large news corpora, more precisely, the fact that the same quotation often appears across multiple news articles in slightly different contexts. Starting from a few seed patterns, such as ["Q", said S.], our method extracts a set of quotation-speaker pairs (Q, S), which are in turn used for discovering new patterns expressing the same quotations; the process is then repeated with the larger pattern set. Our algorithm is highly scalable, which we demonstrate by running it on the large ICWSM 2011 Spinn3r corpus. Validating our results against a crowdsourced ground truth, we obtain 90% precision at 40% recall using a single seed pattern, with significantly higher recall values for more frequently reported (and thus likely more interesting) quotations. Finally, we showcase the usefulness of our algorithm's output for computational social science by analyzing the sentiment expressed in our extracted quotations.

HCJul 22, 2014
Assessing the Performance of Question-and-Answer Communities Using Survival Analysis

Felipe Ortega, Gregorio Convertino, Massimo Zancanaro et al.

Question-&-Answer (QA) websites have emerged as efficient platforms for knowledge sharing and problem solving. In particular, the Stack Exchange platform includes some of the most popular QA communities to date, such as Stack Overflow. Initial metrics used to assess the performance of these communities include summative statistics like the percentage of resolved questions or the average time to receive and validate correct answers. However, more advanced methods for longitudinal data analysis can provide further insights on the QA process, by enabling identification of key predictive factors and systematic comparison of performance across different QA communities. In this paper, we apply survival analysis to a selection of communities from the Stack Exchange platform. We illustrate the advantages of using the proposed methodology to characterize and evaluate the performance of QA communities, and then point to some implications for the design and management of QA platforms.