18.6 AI Jun 3
Tree-Based Formalization of Multi-Agent Complementarity in Human-AI Interactions
Andrea Ferrario
Complementarity is the case in which a human-AI interaction (HAI) outperforms the best prediction benchmark available among its members. Although this idea is central in HAI research, formal work on complementarity remains limited. Existing frameworks do not model how agents' predictions compose into workflow-sensitive multi-agent protocols. We close this gap by introducing a tree-based formalization of complementarity in multi-agent HAI. An HAI protocol is represented by an ordered agent-role configuration together with a rooted planar binary tree whose leaves are decorated by prediction vectors. A local binary composition rule is evaluated recursively along the tree, yielding a tree-relative complementarity functional relative to a pointwise-min oracle benchmark. We prove four results. First, selector-based HAIs, including self- or AI-reliance, cannot achieve complementarity regardless of task, loss, or prediction quality. Second, in regression under squared loss, complementarity is equivalent to Euclidean distance minimization from the ground-truth vector; for $N=2$, the optimal linear-pooling weight has a closed form and a residual-correction interpretation. Third, under linear local composition, every protocol tree defines a barycentric coordinate chart on the simplex of leaf weights; Tamari-cover reparameterizations of protocol trees preserve complementarity, and for $N=4$, they satisfy the pentagon identity. Fourth, in binary classification, no internal local composition can achieve complementarity under endpoint-monotone losses, including standard Bregman and many finite Bernoulli $f$-divergence losses; an analogous obstruction holds for multiclass aggregation under cross-entropy. In summary, our framework shows that complementarity is attainable in multi-agent regression, but obstructed in classification under natural conditions on local aggregation and loss functions.
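The $N=2$ regression case above can be illustrated with a minimal Python sketch. It is not taken from the paper: the function names, the toy data, and the reading of the "pointwise-min oracle" as the sum of per-instance minimum losses across agents are all assumptions made for illustration. The weight is the standard least-squares solution of $\min_w \lVert w\,p_1 + (1-w)\,p_2 - y \rVert^2$, and writing the pool as $p_2 + w\,(p_1 - p_2)$ is one possible way to read the residual-correction interpretation.

```python
from typing import List, Sequence


def squared_losses(pred: Sequence[float], y: Sequence[float]) -> List[float]:
    """Per-instance squared errors of a prediction vector."""
    return [(p - t) ** 2 for p, t in zip(pred, y)]


def optimal_pooling_weight(p1: Sequence[float], p2: Sequence[float],
                           y: Sequence[float]) -> float:
    """Least-squares weight w minimising ||w*p1 + (1-w)*p2 - y||^2.

    Standard projection formula: w = <y - p2, p1 - p2> / ||p1 - p2||^2.
    """
    num = sum((t - b) * (a - b) for a, b, t in zip(p1, p2, y))
    den = sum((a - b) ** 2 for a, b in zip(p1, p2))
    if den == 0:
        raise ValueError("predictions coincide; every weight gives the same pool")
    return num / den


def pool(w: float, p1: Sequence[float], p2: Sequence[float]) -> List[float]:
    """Linear pool written as p2 plus a weighted correction towards p1."""
    return [b + w * (a - b) for a, b in zip(p1, p2)]


def pointwise_min_oracle(p1: Sequence[float], p2: Sequence[float],
                         y: Sequence[float]) -> float:
    """Assumed benchmark: sum of the smaller per-instance loss of the two agents."""
    return sum(min(l1, l2)
               for l1, l2 in zip(squared_losses(p1, y), squared_losses(p2, y)))


def selector_loss(choices: Sequence[int], p1: Sequence[float],
                  p2: Sequence[float], y: Sequence[float]) -> float:
    """Loss of a selector that picks agent 0 (p1) or agent 1 (p2) per instance."""
    picked = [a if c == 0 else b for c, a, b in zip(choices, p1, p2)]
    return sum(squared_losses(picked, y))


# Hypothetical toy data: the human underestimates by 1, the AI overestimates by 1.
y = [1.0, 2.0, 3.0]
human = [0.0, 1.0, 2.0]
ai = [2.0, 3.0, 4.0]

w_star = optimal_pooling_weight(human, ai, y)
pooled_loss = sum(squared_losses(pool(w_star, human, ai), y))
oracle_loss = pointwise_min_oracle(human, ai, y)

print(w_star, pooled_loss, oracle_loss)  # 0.5 0.0 3.0
```

On this toy data the pooled prediction beats the assumed oracle benchmark (0.0 against 3.0), whereas any selector, for example `selector_loss([0, 1, 0], human, ai, y)`, can at best match it, because each selected prediction incurs at least the per-instance minimum loss.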
54.4 CY Apr 27
Update Opacity: Epistemic Accessibility and Governance Under AI System Change
Andrea Ferrario, Joshua Hatherley
Machine learning models embedded in deployed AI systems are routinely updated to maintain correct functioning over time. Yet such updates can generate update opacity: users may not be able to understand why the same input now yields a different output. We argue that update opacity is best understood as a diachronic failure of epistemic accessibility: the problem is that materially relevant changes may fail to remain accessible to human users in forms that support understanding, calibrated reliance, and appropriate action under real role- and time-specific constraints. This makes update opacity a governance problem. Not all change is equally relevant, and disclosing every update would itself undermine use through overload. To address this problem, we combine two complementary governance approaches: the EU AI Act, which helps specify the system-level perimeter of normatively relevant change, and Machine Learning Operations, which provides operational tools for tracking and comparing change over time. On this basis, we propose a framework that models system change through trustworthiness profiles and trustworthiness levels, and uses threshold-based disclosure to surface materially relevant within-envelope change to different stakeholders over time. We illustrate the approach with a medical AI example and derive practical implications for lifecycle documentation, post-market monitoring, and update disclosure.
46.9 SE Apr 21
Beyond the 'Diff': Addressing Agentic Entropy in Agentic Software Development
Matteo Casserini, Alessandro Facchini, Andrea Ferrario
As autonomous coding agents become deeply embedded in software development workflows, their high operational velocity introduces a critical oversight challenge: the accumulating divergence between agentic actions and architectural intent. We term this process agentic entropy: a systemic drift that traditional code diff-based and HCXAI methods fail to capture, as they address local outputs rather than global agentic behaviour. To close this gap, we propose a process-oriented explainability framework that exposes how agentic decisions unfold across time, tool calls, and architectural boundaries. Built around three pillars (conformity seeding, reasoning monitoring, and a causal graph interface), our approach provides intent-level telemetry that complements, rather than replaces, existing review practices. We demonstrate its relevance across two user profiles: lay users engaged in vibe coding, who gain structural visibility otherwise masked by functional success; and professional developers, who gain richer contextual grounding for code review without increased overhead. By treating cognitive drift as a first-class concern alongside code quality, our framework supports the minimum level of human comprehension required for agentic oversight to remain substantive.
AI Mar 26, 2024
Addressing Social Misattributions of Large Language Models: An HCXAI-based Approach
Andrea Ferrario, Alberto Termine, Alessandro Facchini
Human-centered explainable AI (HCXAI) advocates for the integration of social aspects into AI explanations. Central to the HCXAI discourse is the Social Transparency (ST) framework, which aims to make the socio-organizational context of AI systems accessible to their users. In this work, we suggest extending the ST framework to address the risks of social misattributions in Large Language Models (LLMs), particularly in sensitive areas like mental health. Indeed, LLMs, which are remarkably capable of simulating roles and personas, may create mismatches between designers' intentions and users' perceptions of social attributes, which risks promoting emotional manipulation and dangerous behaviors, cases of epistemic injustice, and unwarranted trust. To address these issues, we propose enhancing the ST framework with a fifth 'W-question' to clarify the specific social attributions assigned to LLMs by their designers and users. This addition aims to bridge the gap between LLM capabilities and user perceptions, promoting the ethically responsible development and use of LLM-based technology.
52.0 CY Apr 17
High-Risk AI Systems and the Problem of Identity in the European AI Act
Andrea Ferrario
The EU Artificial Intelligence Act (AIA) establishes a lifecycle governance regime for high-risk AI systems built around ex-ante conformity assessment, post-market monitoring, and re-assessment upon "substantial modification." These obligations presuppose AI identity judgments: regulators and providers must decide when an updated system remains the same system over time. In this work, we show how this logic is clarified by the function+ framework of artifact identity, which individuates AI systems by their intended function together with context-sensitive criteria of appropriate functioning, captured as "AI trustworthiness." We further argue that the AIA does not provide an internal, auditable criterion for synchronic identity--when two AI systems at a given time should count as the same for regulatory purposes--and instead largely defers such sameness determinations to sectoral or harmonization instruments. function+ supplies a synchronic identity test anchored in intended function and trustworthiness profiles and levels, making synchronic identity decisions inspectable in governance settings such as procurement, liability, and market surveillance. Our contribution is a conceptual and auditing lens: we provide a correspondence map between AIA lifecycle obligations and function+ identity components, and we make the synchronic case operationally legible via a minimal decision flow for audit and dispute contexts. We conclude with two implementation-facing recommendations: (1) more precise, testable reporting of intended purpose, and (2) standardized, auditable trustworthiness reporting that supports comparability over time and across deployments.
AI Jun 3, 2025
A Trustworthiness-based Metaphysics of Artificial Intelligence Systems
Andrea Ferrario
Modern AI systems are man-made objects that leverage machine learning to support our lives across a myriad of contexts and applications. Despite extensive epistemological and ethical debates, their metaphysical foundations remain relatively underexplored. The orthodox view simply suggests that AI systems, as artifacts, lack well-posed identity and persistence conditions -- their metaphysical kinds are no real kinds. In this work, we challenge this perspective by introducing a theory of metaphysical identity of AI systems. We do so by characterizing their kinds and introducing identity criteria -- formal rules that answer the questions "When are two AI systems the same?" and "When does an AI system persist, despite change?" Building on Carrara and Vermaas' account of fine-grained artifact kinds, we argue that AI trustworthiness provides a lens to understand AI system kinds and formalize the identity of these artifacts by relating their functional requirements to their physical make-ups. The identity criteria of AI systems are determined by their trustworthiness profiles -- the collection of capabilities that the systems must uphold over time throughout their artifact histories, and their effectiveness in maintaining these capabilities. Our approach suggests that the identity and persistence of AI systems are sensitive to the socio-technical context of their design and utilization via their trustworthiness, providing a solid metaphysical foundation to the epistemological, ethical, and legal discussions about these artifacts.
AI Jan 14
Epistemology gives a Future to Complementarity in Human-AI Interactions
Andrea Ferrario, Alessandro Facchini, Juan M. Durán
Human-AI complementarity is the claim that a human supported by an AI system can outperform either alone in a decision-making process. Since its introduction in the human-AI interaction literature, it has gained traction by generalizing the reliance paradigm and by offering a more practical alternative to the contested construct of 'trust in AI.' Yet complementarity faces key theoretical challenges: it lacks precise theoretical anchoring, it is formalized merely as a post hoc indicator of relative predictive accuracy, it remains silent about other desiderata of human-AI interactions, and it abstracts away from the magnitude-cost profile of its performance gain. As a result, complementarity is difficult to obtain in empirical settings. In this work, we leverage epistemology to address these challenges by reframing complementarity within the discourse on justificatory AI. Drawing on computational reliabilism, we argue that historical instances of complementarity function as evidence that a given human-AI interaction is a reliable epistemic process for a given predictive task. Together with other reliability indicators assessing the alignment of the human-AI team with the epistemic standards and socio-technical practices, complementarity contributes to the degree of reliability of human-AI teams when generating predictions. This supports the practical reasoning of those affected by these outputs -- patients, managers, regulators, and others. In summary, our approach suggests that the role and value of complementarity lie not in providing a relative measure of predictive accuracy, but in helping calibrate decision-making to the reliability of AI-supported processes that increasingly shape everyday life.
AI Jan 14
A Scoping Review of the Ethical Perspectives on Anthropomorphising Large Language Model-Based Conversational Agents
Andrea Ferrario, Rasita Vinay, Matteo Casserini et al.
Anthropomorphisation -- the phenomenon whereby non-human entities are ascribed human-like qualities -- has become increasingly salient with the rise of large language model (LLM)-based conversational agents (CAs). Unlike earlier chatbots, LLM-based CAs routinely generate interactional and linguistic cues, such as first-person self-reference and epistemic and affective expressions, that empirical work shows can increase engagement. At the same time, anthropomorphisation raises ethical concerns, including deception, overreliance, and exploitative relationship framing, while some authors argue that anthropomorphic interaction may support autonomy, well-being, and inclusion. Despite increasing interest in the phenomenon, the literature remains fragmented across domains and varies substantially in how it defines, operationalizes, and normatively evaluates anthropomorphisation. This scoping review maps ethically oriented work on anthropomorphising LLM-based CAs across five databases and three preprint repositories. We synthesize (1) conceptual foundations, (2) ethical challenges and opportunities, and (3) methodological approaches. We find convergence on attribution-based definitions but substantial divergence in operationalization, a predominantly risk-forward normative framing, and limited empirical work that links observed interaction effects to actionable governance guidance. We conclude with a research agenda and design/governance recommendations for ethically deploying anthropomorphic cues in LLM-based conversational agents.
AI Oct 9, 2020
A Series of Unfortunate Counterfactual Events: the Role of Time in Counterfactual Explanations
Andrea Ferrario, Michele Loi
Counterfactual explanations are a prominent example of post-hoc interpretability methods in the explainable Artificial Intelligence research domain. They provide individuals with alternative scenarios and a set of recommendations to achieve a sought-after machine learning model outcome. Recently, the literature has identified desiderata of counterfactual explanations, such as feasibility, actionability, and sparsity, which should support their applicability in real-world contexts. However, we show that the literature has neglected the problem of the time dependency of counterfactual explanations. We argue that, due to their time dependency and because of the provision of recommendations, even feasible, actionable, and sparse counterfactual explanations may not be appropriate in real-world applications. This is due to the possible emergence of what we call "unfortunate counterfactual events." These events may occur due to the retraining of machine learning models whose outcomes have to be explained via counterfactual explanations. Series of unfortunate counterfactual events frustrate the efforts of those individuals who successfully implemented the recommendations of counterfactual explanations. This negatively affects people's trust in the ability of institutions to provide machine learning-supported decisions consistently. We introduce an approach to address the problem of the emergence of unfortunate counterfactual events that makes use of histories of counterfactual explanations. In the final part of the paper, we propose an ethical analysis of two distinct strategies to cope with the challenge of unfortunate counterfactual events. We show that they respond to an ethically responsible imperative to preserve the trustworthiness of credit lending organizations, the decision models they employ, and the socio-economic function of credit lending.