Serena Villata

CL
h-index37
15papers
194citations
Novelty33%
AI Score48

15 Papers

CLMar 3
Compact Prompting in Instruction-tuned LLMs for Joint Argumentative Component Detection

Sofiane Elguendouze, Erwan Hain, Elena Cabrio et al.

Argumentative component detection (ACD) is a core subtask of Argument(ation) Mining (AM) and one of its most challenging aspects, as it requires jointly delimiting argumentative spans and classifying them into components such as claims and premises. While research on this subtask remains relatively limited compared to other AM tasks, most existing approaches formulate it as a simplified sequence labeling problem, component classification, or a pipeline of component segmentation followed by classification. In this paper, we propose a novel approach based on instruction-tuned Large Language Models (LLMs) using compact instruction-based prompts, and reframe ACD as a language generation task, enabling arguments to be identified directly from plain text without relying on pre-segmented components. Experiments on standard benchmarks show that our approach achieves higher performance compared to state-of-the-art systems. To the best of our knowledge, this is one of the first attempts to fully model ACD as a generative task, highlighting the potential of instruction tuning for complex AM problems.

CLFeb 19
PEACE 2.0: Grounded Explanations and Counter-Speech for Combating Hate Expressions

Greta Damo, Stéphane Petiot, Elena Cabrio et al.

The increasing volume of hate speech on online platforms poses significant societal challenges. While the Natural Language Processing community has developed effective methods to automatically detect the presence of hate speech, responses to it, called counter-speech, are still an open challenge. We present PEACE 2.0, a novel tool that, besides analysing and explaining why a message is considered hateful or not, also generates a response to it. More specifically, PEACE 2.0 has three main new functionalities: leveraging a Retrieval-Augmented Generation (RAG) pipeline i) to ground HS explanations into evidence and facts, ii) to automatically generate evidence-grounded counter-speech, and iii) exploring the characteristics of counter-speech replies. By integrating these capabilities, PEACE 2.0 enables in-depth analysis and response generation for both explicit and implicit hateful messages.

CLApr 11, 2024Code
Medical mT5: An Open-Source Multilingual Text-to-Text LLM for The Medical Domain

Iker García-Ferrero, Rodrigo Agerri, Aitziber Atutxa Salazar et al.

Research on language technology for the development of medical applications is currently a hot topic in Natural Language Understanding and Generation. Thus, a number of large language models (LLMs) have recently been adapted to the medical domain, so that they can be used as a tool for mediating in human-AI interaction. While these LLMs display competitive performance on automated medical texts benchmarks, they have been pre-trained and evaluated with a focus on a single language (English mostly). This is particularly true of text-to-text models, which typically require large amounts of domain-specific pre-training data, often not easily accessible for many languages. In this paper, we address these shortcomings by compiling, to the best of our knowledge, the largest multilingual corpus for the medical domain in four languages, namely English, French, Italian and Spanish. This new corpus has been used to train Medical mT5, the first open-source text-to-text multilingual model for the medical domain. Additionally, we present two new evaluation benchmarks for all four languages with the aim of facilitating multilingual research in this domain. A comprehensive evaluation shows that Medical mT5 outperforms both encoders and similarly sized text-to-text models for the Spanish, French, and Italian benchmarks, while being competitive with current state-of-the-art LLMs in English.

LGApr 10, 2025Code
Merging Embedded Topics with Optimal Transport for Online Topic Modeling on Data Streams

Federica Granese, Benjamin Navet, Serena Villata et al.

Topic modeling is a key component in unsupervised learning, employed to identify topics within a corpus of textual data. The rapid growth of social media generates an ever-growing volume of textual data daily, making online topic modeling methods essential for managing these data streams that continuously arrive over time. This paper introduces a novel approach to online topic modeling named StreamETM. This approach builds on the Embedded Topic Model (ETM) to handle data streams by merging models learned on consecutive partial document batches using unbalanced optimal transport. Additionally, an online change point detection algorithm is employed to identify shifts in topics over time, enabling the identification of significant changes in the dynamics of text streams. Numerical experiments on simulated and real-world data show StreamETM outperforming competitors. We provide the code publicly available at https://github.com/fgranese/StreamETM.

CLMar 24, 2024
Argument Quality Assessment in the Age of Instruction-Following Large Language Models

Henning Wachsmuth, Gabriella Lapesa, Elena Cabrio et al.

The computational treatment of arguments on controversial issues has been subject to extensive NLP research, due to its envisioned impact on opinion formation, decision making, writing education, and the like. A critical task in any such application is the assessment of an argument's quality - but it is also particularly challenging. In this position paper, we start from a brief survey of argument quality research, where we identify the diversity of quality notions and the subjectiveness of their perception as the main hurdles towards substantial progress on argument quality assessment. We argue that the capabilities of instruction-following large language models (LLMs) to leverage knowledge across contexts enable a much more reliable assessment. Rather than just fine-tuning LLMs towards leaderboard chasing on assessment tasks, they need to be instructed systematically with argumentation theories and scenarios as well as with ways to solve argument-related problems. We discuss the real-world opportunities and ethical issues emerging thereby.

CLJun 13, 2025
Effectiveness of Counter-Speech against Abusive Content: A Multidimensional Annotation and Classification Study

Greta Damo, Elena Cabrio, Serena Villata

Counter-speech (CS) is a key strategy for mitigating online Hate Speech (HS), yet defining the criteria to assess its effectiveness remains an open challenge. We propose a novel computational framework for CS effectiveness classification, grounded in linguistics, communication and argumentation concepts. Our framework defines six core dimensions - Clarity, Evidence, Emotional Appeal, Rebuttal, Audience Adaptation, and Fairness - which we use to annotate 4,214 CS instances from two benchmark datasets, resulting in a novel linguistic resource released to the community. In addition, we propose two classification strategies, multi-task and dependency-based, achieving strong results (0.94 and 0.96 average F1 respectively on both expert- and user-written CS), outperforming standard baselines, and revealing strong interdependence among dimensions.

LGOct 21, 2025
Stick-Breaking Embedded Topic Model with Continuous Optimal Transport for Online Analysis of Document Streams

Federica Granese, Serena Villata, Charles Bouveyron

Online topic models are unsupervised algorithms to identify latent topics in data streams that continuously evolve over time. Although these methods naturally align with real-world scenarios, they have received considerably less attention from the community compared to their offline counterparts, due to specific additional challenges. To tackle these issues, we present SB-SETM, an innovative model extending the Embedded Topic Model (ETM) to process data streams by merging models formed on successive partial document batches. To this end, SB-SETM (i) leverages a truncated stick-breaking construction for the topic-per-document distribution, enabling the model to automatically infer from the data the appropriate number of active topics at each timestep; and (ii) introduces a merging strategy for topic embeddings based on a continuous formulation of optimal transport adapted to the high dimensionality of the latent topic space. Numerical experiments show SB-SETM outperforming baselines on simulated scenarios. We extensively test it on a real-world corpus of news articles covering the Russian-Ukrainian war throughout 2022-2023.

CLOct 14, 2025
Beating Harmful Stereotypes Through Facts: RAG-based Counter-speech Generation

Greta Damo, Elena Cabrio, Serena Villata

Counter-speech generation is at the core of many expert activities, such as fact-checking and hate speech, to counter harmful content. Yet, existing work treats counter-speech generation as pure text generation task, mainly based on Large Language Models or NGO experts. These approaches show severe drawbacks due to the limited reliability and coherence in the generated countering text, and in scalability, respectively. To close this gap, we introduce a novel framework to model counter-speech generation as knowledge-wise text generation process. Our framework integrates advanced Retrieval-Augmented Generation (RAG) pipelines to ensure the generation of trustworthy counter-speech for 8 main target groups identified in the hate speech literature, including women, people of colour, persons with disabilities, migrants, Muslims, Jews, LGBT persons, and other. We built a knowledge base over the United Nations Digital Library, EUR-Lex and the EU Agency for Fundamental Rights, comprising a total of 32,792 texts. We use the MultiTarget-CONAN dataset to empirically assess the quality of the generated counter-speech, both through standard metrics (i.e., JudgeLM) and a human evaluation. Results show that our framework outperforms standard LLM baselines and competitive approach, on both assessments. The resulting framework and the knowledge base pave the way for studying trustworthy and sound counter-speech generation, in hate speech and beyond.

CLAug 5, 2025
RooseBERT: A New Deal For Political Language Modelling

Deborah Dore, Elena Cabrio, Serena Villata

The increasing amount of political debates and politics-related discussions calls for the definition of novel computational methods to automatically analyse such content with the final goal of lightening up political deliberation to citizens. However, the specificity of the political language and the argumentative form of these debates (employing hidden communication strategies and leveraging implicit arguments) make this task very challenging, even for current general-purpose pre-trained Language Models. To address this issue, we introduce a novel pre-trained Language Model for political discourse language called RooseBERT. Pre-training a language model on a specialised domain presents different technical and linguistic challenges, requiring extensive computational resources and large-scale data. RooseBERT has been trained on large political debate and speech corpora (8K debates, each composed of several sub-debates on different topics) in English. To evaluate its performances, we fine-tuned it on four downstream tasks related to political debate analysis, i.e., stance detection, sentiment analysis, argument component detection and classification, and argument relation prediction and classification. Our results demonstrate significant improvements over general-purpose Language Models on these four tasks, highlighting how domain-specific pre-training enhances performance in political debate analysis. We release RooseBERT for the research community.

CLFeb 14, 2020
A Dataset Independent Set of Baselines for Relation Prediction in Argument Mining

Oana Cocarascu, Elena Cabrio, Serena Villata et al.

Argument Mining is the research area which aims at extracting argument components and predicting argumentative relations (i.e.,support and attack) from text. In particular, numerous approaches have been proposed in the literature to predict the relations holding between the arguments, and application-specific annotated resources were built for this purpose. Despite the fact that these resources have been created to experiment on the same task, the definition of a single relation prediction method to be successfully applied to a significant portion of these datasets is an open research problem in Argument Mining. This means that none of the methods proposed in the literature can be easily ported from one resource to another. In this paper, we address this problem by proposing a set of dataset independent strong neural baselines which obtain homogeneous results on all the datasets proposed in the literature for the argumentative relation prediction task. Thus, our baselines can be employed by the Argument Mining community to compare more effectively how well a method performs on the argumentative relation prediction task.

DBMar 11, 2019
Graph Data on the Web: extend the pivot, don't reinvent the wheel

Fabien Gandon, Franck Michel, Olivier Corby et al.

This article is a collective position paper from the Wimmics research team, expressing our vision of how Web graph data technologies should evolve in the future in order to ensure a high-level of interoperability between the many types of applications that produce and consume graph data. Wimmics stands for Web-Instrumented Man-Machine Interactions, Communities, and Semantics. We are a joint research team between INRIA Sophia Antipolis-M{é}diterran{é}e and I3S (CNRS and Universit{é} C{ô}te d'Azur). Our challenge is to bridge formal semantics and social semantics on the web. Our research areas are graph-oriented knowledge representation, reasoning and operationalization to model and support actors, actions and interactions in web-based epistemic communities. The application of our research is supporting and fostering interactions in online communities and management of their resources. In this position paper, we emphasize the need to extend the semantic Web standard stack to address and fulfill new graph data needs, as well as the importance of remaining compatible with existing recommendations, in particular the RDF stack, to avoid the painful duplication of models, languages, frameworks, etc. The following sections group motivations for different directions of work and collect reasons for the creation of a working group on RDF 2.0 and other recommendations of the RDF family.

AISep 23, 2017
Prioritized Norms in Formal Argumentation

Beishui Liao, Nir Oren, Leendert van der Torre et al.

To resolve conflicts among norms, various nonmonotonic formalisms can be used to perform prioritized normative reasoning. Meanwhile, formal argumentation provides a way to represent nonmonotonic logics. In this paper, we propose a representation of prioritized normative reasoning by argumentation. Using hierarchical abstract normative systems, we define three kinds of prioritized normative reasoning approaches, called Greedy, Reduction, and Optimization. Then, after formulating an argumentation theory for a hierarchical abstract normative system, we show that for a totally ordered hierarchical abstract normative system, Greedy and Reduction can be represented in argumentation by applying the weakest link and the last link principles respectively, and Optimization can be represented by introducing additional defeats capturing the idea that for each argument that contains a norm not belonging to the maximal obeyable set then this argument should be rejected.

AIOct 19, 2015
System Descriptions of the First International Competition on Computational Models of Argumentation (ICCMA'15)

Matthias Thimm, Serena Villata

This volume contains the system description of the 18 solvers submitted to the First International Competition on Computational Models of Argumentation (ICCMA'15) and therefore gives an overview on state-of-the-art of computational approaches to abstract argumentation problems. Further information on the results of the competition and the performance of the individual solvers can be found on at http://argumentationcompetition.org/2015/.

AIAug 29, 2014
Challenges in Bridging Social Semantics and Formal Semantics on the Web

Fabien Lucien Gandon, Michel Buffa, Elena Cabrio et al.

This paper describes several results of Wimmics, a research lab which names stands for: web-instrumented man-machine interactions, communities, and semantics. The approaches introduced here rely on graph-oriented knowledge representation, reasoning and operationalization to model and support actors, actions and interactions in web-based epistemic communities. The re-search results are applied to support and foster interactions in online communities and manage their resources.

AIMay 5, 2014
Towards a Benchmark of Natural Language Arguments

Elena Cabrio, Serena Villata

The connections among natural language processing and argumentation theory are becoming stronger in the latest years, with a growing amount of works going in this direction, in different scenarios and applying heterogeneous techniques. In this paper, we present two datasets we built to cope with the combination of the Textual Entailment framework and bipolar abstract argumentation. In our approach, such datasets are used to automatically identify through a Textual Entailment system the relations among the arguments (i.e., attack, support), and then the resulting bipolar argumentation graphs are analyzed to compute the accepted arguments.