AI May 29
LLM-FACETS: A Privacy-Preserving Framework for Evaluating LLM Transparency and Accountability
Tom Lucas, Alessio Buscemi, Alfredo Capozucca et al.
Assessing whether Large Language Model (LLM) outputs are factually grounded, epistemically calibrated, and methodologically reproducible is a prerequisite for responsible AI deployment. Yet auditing LLMs remains inaccessible to non-technical practitioners: existing tools require programming expertise and non-trivial environment setup, and cloud-hosted platforms transmit evaluation data to external services, creating barriers for domain experts and compliance officers legally responsible for AI oversight. We introduce LLM-FACETS (LLM FActuality Cross-EvaluaTion System): an open-source framework with a browser-accessible interface and a plugin architecture, structured around three practitioner profiles (technical experts, domain experts, compliance officers) that mirror the stakeholder categories identified in the EU AI Act and the NIST AI Risk Management Framework. The architecture makes data flows explicit: deterministic metrics (BLEU, ROUGE, BERTScore) run entirely within the self-hosted server with no outbound transmission; LLM-judge metrics contact external APIs explicitly, with users retaining full credential control. The framework operationalizes transparency through three mechanisms: token-level log-probability visualization for epistemic uncertainty, multi-judge consensus to mitigate judge bias, and RAG Triad metrics (Faithfulness, Answer Relevance, Context Relevance) to detect and localize hallucinations. The plugin architecture allows any new metric or dataset to be integrated without modifying the evaluation pipeline. The open-source implementation enables cross-checking across multiple metrics targeting the same property, ensuring reproducibility and decoupling AI accountability from the teams building the systems assessed. We verify the framework through cross-validation of 18 metric implementations against canonical reference libraries.
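The plugin idea described in the abstract (new metrics added without touching the evaluation pipeline) can be illustrated with a minimal Python sketch. This is not the LLM-FACETS API: the registry `METRICS`, the decorator `register_metric`, the function `evaluate`, and both toy metrics are hypothetical names chosen for illustration. The metrics shown are simple deterministic stand-ins that run locally, not BLEU, ROUGE, or BERTScore.

```python
from collections import Counter
from typing import Callable, Dict

# Hypothetical registry: metric name -> scoring function (candidate, reference) -> score.
MetricFn = Callable[[str, str], float]
METRICS: Dict[str, MetricFn] = {}


def register_metric(name: str) -> Callable[[MetricFn], MetricFn]:
    """Decorator that adds a metric to the registry under the given name."""
    def decorator(fn: MetricFn) -> MetricFn:
        METRICS[name] = fn
        return fn
    return decorator


@register_metric("unigram_precision")
def unigram_precision(candidate: str, reference: str) -> float:
    """Clipped unigram precision: a toy stand-in for an n-gram overlap metric."""
    cand_tokens = candidate.lower().split()
    if not cand_tokens:
        return 0.0
    ref_counts = Counter(reference.lower().split())
    cand_counts = Counter(cand_tokens)
    # Each candidate token is credited at most as often as it occurs in the reference.
    overlap = sum(min(count, ref_counts[tok]) for tok, count in cand_counts.items())
    return overlap / len(cand_tokens)


@register_metric("exact_match")
def exact_match(candidate: str, reference: str) -> float:
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(candidate.strip().lower() == reference.strip().lower())


def evaluate(candidate: str, reference: str) -> Dict[str, float]:
    """The pipeline only iterates over the registry; it never names a metric."""
    return {name: fn(candidate, reference) for name, fn in METRICS.items()}


if __name__ == "__main__":
    print(evaluate("the cat sat", "the cat sat on the mat"))
```

Under these assumptions, adding a metric means writing one decorated function; `evaluate` stays unchanged, which is the property the abstract attributes to the plugin architecture.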
SE May 4
A Low-Code Approach for the Automatic Personalization of Conversational Agents
Aaron Conrardy, Alfredo Capozucca, Jordi Cabot
In this paper, we conducted a systematic literature review (SLR) on the state of user modeling in the model-driven engineering (MDE) domain. Results show a diverse set of disconnected proposals that cover only a subset of dimensions, with an emphasis on those characteristics that are easier to profile. Moreover, most dimensions are regarded as fixed instead of allowing their dynamic evolution during the interaction with the software application. Tool support is also rather limited, mostly to enabling the creation of the user models themselves. The roadmap we hope to see in this area stems from the discussion points seen above. For instance, we believe the community should agree on a unified and reusable user model covering the superset of all dimensions present in the literature, plus additional ones we could learn from user profiling in other domains (e.g. sociology). On the technical side, we expect to see a new generation of ML-based proposals to automatically and incrementally derive a user profile from the analysis of user interactions, and a number of automatic pipelines able to transform the user information into concrete application adaptations that personalize the application to cater to the user's needs and profile.
LO Mar 12
{log}: From a Constraint Logic Programming Language to a Formal Verification Tool
Maximiliano Cristiá, Alfredo Capozucca, Gianfranco Rossi
{log} (read 'setlog') was born as a Constraint Logic Programming (CLP) language where sets and binary relations are first-class citizens, thus fostering set programming. Internally, {log} is a constraint satisfiability solver implementing decision procedures for several fragments of set theory. Hence, {log} can be used as a declarative, set-based logic programming language and as an automated theorem prover for set theory. Over time, {log} has been extended with components integrated into the satisfiability solver, thus providing a formal verification environment. In this paper we make a comprehensive presentation of this environment, which includes a language for the description of state machines based on set theory, an interactive environment for the execution of functional scenarios over state machines, a generator of verification conditions for state machines, automated verification of state machines, and test case generation. State machines are both programs and specifications: exactly the same code works as a program and as its specification. In this way, with a few additions, a CLP language turned into a seamlessly integrated programming and automated proof system.
CY Jan 20, 2025
Do AI assistants help students write formal specifications? A study with ChatGPT and the B-Method
Alfredo Capozucca, Daniil Yampolskyi, Alexander Goldberg et al.
This paper investigates the role of AI assistants, specifically OpenAI's ChatGPT, in teaching formal methods (FM) to undergraduate students, using the B-method as a formal specification technique. While existing studies demonstrate the effectiveness of AI in coding tasks, no study reports on its impact on formal specifications. We examine whether ChatGPT provides an advantage when writing B-specifications and analyse student trust in its outputs. Our findings indicate that the AI does not help students improve the correctness of their specifications, with low trust correlating with better outcomes. Additionally, we identify a behavioural pattern of interaction with ChatGPT that may influence the correctness of B-specifications.
SE Apr 4, 2019
DevOps and its Philosophy : Education Matters!
Evgeny Bobrov, Antonio Bucchiarone, Alfredo Capozucca et al.
DevOps processes comply with principles and offer practices whose main objective is to efficiently support the evolution of IT systems. To be efficient, a DevOps process relies on a set of integrated tools. Together with Agile methods, DevOps is the foremost competency required by the industry. DevOps processes share many aspects with microservices approaches, especially the modularity and flexibility that enable continuous change and delivery. As this is a new approach, it is necessary to develop and offer training programs to academia and industry to prepare our engineers in the best possible way. In this chapter we present the main aspects of the educational effort made in recent years to teach the concepts and values of the DevOps philosophy. This includes principles, practices, tools and architectures, primarily the microservice architectural style. Two experiences have been carried out: one at the academic level, as a master's program course, and the other as an industrial training. Based on those two experiences, we provide a comparative analysis and some proposals to develop and improve DevOps education in the future.
SE Mar 18, 2019
Teaching DevOps in academia and industry: reflections and vision
Evgeny Bobrov, Antonio Bucchiarone, Alfredo Capozucca et al.
This paper describes our experience of delivering educational programs on DevOps in academia and in industry, compares the two approaches, and sums up the lessons learnt. We also propose a vision to implement a shift in the Software Engineering Higher Education curricula.