Giovanni Sartor

AI
h-index34
11papers
270citations
Novelty21%
AI Score42

11 Papers

CLAug 4, 2023
Legal Summarisation through LLMs: The PRODIGIT Project

Thiago Dal Pont, Federico Galli, Andrea Loreggia et al.

We present some initial results of a large-scale Italian project called PRODIGIT which aims to support tax judges and lawyers through digital technology, focusing on AI. We have focused on generation of summaries of judicial decisions and on the extraction of related information, such as the identification of legal issues and decision-making criteria, and the specification of keywords. To this end, we have deployed and evaluated different tools and approaches to extractive and abstractive summarisation. We have applied LLMs, and particularly on GPT4, which has enabled us to obtain results that proved satisfactory, according to an evaluation by expert tax judges and lawyers. On this basis, a prototype application is being built which will be made publicly available.

AIOct 20, 2022
Modelling and Explaining Legal Case-based Reasoners through Classifiers

Xinghan Liu, Emiliano Lorini, Antonino Rotolo et al.

This paper brings together two lines of research: factor-based models of case-based reasoning (CBR) and the logical specification of classifiers. Logical approaches to classifiers capture the connection between features and outcomes in classifier systems. Factor-based reasoning is a popular approach to reasoning by precedent in AI & Law. Horty (2011) has developed the factor-based models of precedent into a theory of precedential constraint. In this paper we combine the modal logic approach (binary-input classifier, BLC) to classifiers and their explanations given by Liu & Lorini (2021) with Horty's account of factor-based CBR, since both a classifier and CBR map sets of features to decisions or classifications. We reformulate case bases of Horty in the language of BCL, and give several representation results. Furthermore, we show how notions of CBR, e.g. reason, preference between reasons, can be analyzed by notions of classifier system.

AIJul 11, 2023
Stable Normative Explanations: From Argumentation to Deontic Logic

Cecilia Di Florio, Guido Governatori, Antonino Rotolo et al.

This paper examines how a notion of stable explanation developed elsewhere in Defeasible Logic can be expressed in the context of formal argumentation. With this done, we discuss the deontic meaning of this reconstruction and show how to build from argumentation neighborhood structures for deontic logic where this notion of explanation can be characterised. Some direct complexity results are offered.

CLDec 4, 2025
Challenging the Abilities of Large Language Models in Italian: a Community Initiative

Malvina Nissim, Danilo Croce, Viviana Patti et al.

The rapid progress of Large Language Models (LLMs) has transformed natural language processing and broadened its impact across research and society. Yet, systematic evaluation of these models, especially for languages beyond English, remains limited. "Challenging the Abilities of LAnguage Models in ITAlian" (CALAMITA) is a large-scale collaborative benchmarking initiative for Italian, coordinated under the Italian Association for Computational Linguistics. Unlike existing efforts that focus on leaderboards, CALAMITA foregrounds methodology: it federates more than 80 contributors from academia, industry, and the public sector to design, document, and evaluate a diverse collection of tasks, covering linguistic competence, commonsense reasoning, factual consistency, fairness, summarization, translation, and code generation. Through this process, we not only assembled a benchmark of over 20 tasks and almost 100 subtasks, but also established a centralized evaluation pipeline that supports heterogeneous datasets and metrics. We report results for four open-weight LLMs, highlighting systematic strengths and weaknesses across abilities, as well as challenges in task-specific evaluation. Beyond quantitative results, CALAMITA exposes methodological lessons: the necessity of fine-grained, task-representative metrics, the importance of harmonized pipelines, and the benefits and limitations of broad community engagement. CALAMITA is conceived as a rolling benchmark, enabling continuous integration of new tasks and models. This makes it both a resource -- the most comprehensive and diverse benchmark for Italian to date -- and a framework for sustainable, community-driven evaluation. We argue that this combination offers a blueprint for other languages and communities seeking inclusive and rigorous LLM evaluation practices.

CLOct 8, 2025
Towards Reliable Retrieval in RAG Systems for Large Legal Datasets

Markus Reuter, Tobias Lingenberg, Rūta Liepiņa et al.

Retrieval-Augmented Generation (RAG) is a promising approach to mitigate hallucinations in Large Language Models (LLMs) for legal applications, but its reliability is critically dependent on the accuracy of the retrieval step. This is particularly challenging in the legal domain, where large databases of structurally similar documents often cause retrieval systems to fail. In this paper, we address this challenge by first identifying and quantifying a critical failure mode we term Document-Level Retrieval Mismatch (DRM), where the retriever selects information from entirely incorrect source documents. To mitigate DRM, we investigate a simple and computationally efficient technique which we refer to as Summary-Augmented Chunking (SAC). This method enhances each text chunk with a document-level synthetic summary, thereby injecting crucial global context that would otherwise be lost during a standard chunking process. Our experiments on a diverse set of legal information retrieval tasks show that SAC greatly reduces DRM and, consequently, also improves text-level retrieval precision and recall. Interestingly, we find that a generic summarization strategy outperforms an approach that incorporates legal expert domain knowledge to target specific legal elements. Our work provides evidence that this practical, scalable, and easily integrable technique enhances the reliability of RAG systems when applied to large-scale legal document datasets.

CYMay 10, 2025
Getting Ready for the EU AI Act in Healthcare. A call for Sustainable AI Development and Deployment

John Brandt Brodersen, Ilaria Amelia Caggiano, Pedro Kringen et al.

Assessments of trustworthiness have become a cornerstone of responsible AI development. Especially in high-stakes fields like healthcare, aligning technical, evidence-based, and ethical practices with forthcoming legal requirements is increasingly urgent. We argue that developers and deployers of AI systems for the medical domain should be proactive and take steps to progressively ensure that such systems, both those currently in use and those being developed or planned, respect the requirements of the AI Act, which has come into force in August 2024. This is necessary if full and effective compliance is to be ensured when the most relevant provisions of the Act become effective (August 2026). The engagement with the AI Act cannot be viewed as a formalistic exercise. Compliance with the AI Act needs to be carried out through the proactive commitment to the ethical principles of trustworthy AI. These principles provide the background for the Act, which mentions them several times and connects them to the protection of public interest. They can be used to interpret and apply the Act's provisions and to identify good practices, increasing the validity and sustainability of AI systems over time.

AIMar 5
Legal interpretation and AI: from expert systems to argumentation and LLMs

Václav Janeček, Giovanni Sartor

AI and Law research has encountered legal interpretation in different ways, in the context of its evolving approaches and methodologies. Research on expert system has focused on legal knowledge engineering, with the goal of ensuring that human-generated interpretations can be precisely transferred into knowledge-bases, to be consistently applied. Research on argumentation has aimed at representing the structure of interpretive arguments, as well as their dialectical interactions, to assess of the acceptability of interpretive claims within argumentation frameworks. Research on machine learning has focused on the automated generation of interpretive suggestions and arguments, through general and specialised language models, now being increasingly deployed in legal practice.

AIJul 24, 2025
Foundations for Risk Assessment of AI in Protecting Fundamental Rights

Antonino Rotolo, Beatrice Ferrigno, Jose Miguel Angel Garcia Godinez et al.

This chapter introduces a conceptual framework for qualitative risk assessment of AI, particularly in the context of the EU AI Act. The framework addresses the complexities of legal compliance and fundamental rights protection by itegrating definitional balancing and defeasible reasoning. Definitional balancing employs proportionality analysis to resolve conflicts between competing rights, while defeasible reasoning accommodates the dynamic nature of legal decision-making. Our approach stresses the need for an analysis of AI deployment scenarios and for identifying potential legal violations and multi-layered impacts on fundamental rights. On the basis of this analysis, we provide philosophical foundations for a logical account of AI risk analysis. In particular, we consider the basic building blocks for conceptually grasping the interaction between AI deployment scenarios and fundamental rights, incorporating in defeasible reasoning definitional balancing and arguments about the contextual promotion or demotion of rights. This layered approach allows for more operative models of assessment of both high-risk AI systems and General Purpose AI (GPAI) systems, emphasizing the broader applicability of the latter. Future work aims to develop a formal model and effective algorithms to enhance AI risk assessment, bridging theoretical insights with practical applications to support responsible AI governance.

AIMar 23, 2020
Modeling Contrary-to-Duty with CP-nets

Roberta Calegari, Andrea Loreggia, Emiliano Lorini et al.

In a ceteris-paribus semantics for deontic logic, a state of affairs where a larger set of prescriptions is respected is preferable to a state of affairs where some of them are violated. Conditional preference nets (CP-nets) are a compact formalism to express and analyse ceteris paribus preferences, which nice computational properties. This paper shows how deontic concepts can be captured through conditional preference models. A restricted deontic logic will be defined, and mapped into conditional preference nets. We shall also show how to model contrary to duties obligations in CP-nets and how to capture in this formalism the distinction between strong and weak permission.

AIMay 3, 2018
CLAUDETTE: an Automated Detector of Potentially Unfair Clauses in Online Terms of Service

Marco Lippi, Przemyslaw Palka, Giuseppe Contissa et al.

Terms of service of on-line platforms too often contain clauses that are potentially unfair to the consumer. We present an experimental study where machine learning is employed to automatically detect such potentially unfair clauses. Results show that the proposed system could provide a valuable tool for lawyers and consumers alike.

AIAug 1, 2017
A Labelling Framework for Probabilistic Argumentation

Regis Riveret, Pietro Baroni, Yang Gao et al.

The combination of argumentation and probability paves the way to new accounts of qualitative and quantitative uncertainty, thereby offering new theoretical and applicative opportunities. Due to a variety of interests, probabilistic argumentation is approached in the literature with different frameworks, pertaining to structured and abstract argumentation, and with respect to diverse types of uncertainty, in particular the uncertainty on the credibility of the premises, the uncertainty about which arguments to consider, and the uncertainty on the acceptance status of arguments or statements. Towards a general framework for probabilistic argumentation, we investigate a labelling-oriented framework encompassing a basic setting for rule-based argumentation and its (semi-) abstract account, along with diverse types of uncertainty. Our framework provides a systematic treatment of various kinds of uncertainty and of their relationships and allows us to back or question assertions from the literature.