Damian Andrew Tamburri

SE
h-index 13
6 papers
134 citations
Novelty 33%
AI Score 44

6 Papers

77.3 · SE · Jun 3
A Taxonomy of Runtime Faults in Model Context Protocol Servers

Joshua Owotogbe, Indika Kumara, Willem-Jan van den Heuvel et al.

The Model Context Protocol (MCP) enables Large Language Models (LLMs) to interact with external tools and data sources via a standardized protocol. Its rapid adoption in tool-augmented Artificial Intelligence (AI) workflows has introduced new reliability challenges, such as configuration parameters that are accepted but not enforced at runtime and therefore lead to unintended default behavior. Yet the runtime fault characteristics of MCP servers remain empirically unexamined. We present the first empirical taxonomy of runtime faults in MCP servers. We manually analyzed 837 MCP-specific runtime fault threads from 473 actively maintained MCP server GitHub repositories and derived a taxonomy using a bottom-up open coding procedure. The taxonomy comprises 11 top-level categories and 27 subcategories (73 leaf fault types), covering recurrent failures across protocol interactions, tool invocations, schema enforcement, state management, model-provider integration, security validation, and timeouts or explicit cancellations of in-progress operations. To assess the taxonomy's external validity, we surveyed 55 MCP server developers. Respondents reported experiencing an average of 20 of the 27 fault subcategories, and no category remained unobserved. These results indicate that the taxonomy reflects widely observed runtime failures in MCP-based systems and can assist future AI software maintenance and evolution.

26.3 · SE · May 8
"Show Me You Comply... Without Showing Me Anything": Zero-Knowledge Software Auditing for AI-Enabled Systems

Filippo Scaramuzza, Renato Cordeiro Ferreira, Giovanni Quattrocchi et al.

Classical software verification and validation techniques, such as procedural audits, formal methods, or model documentation, are the traditional mechanisms used to achieve the verifiable accountability now required by regulations like the EU AI Act. These methods are either expensive or heavily manual, and ill-suited for the opaque, "black box" nature of most Artificial Intelligence (AI) models. A conflict arises: high auditability and verifiability are required by law, but such transparency conflicts with the need to protect the assets being audited (e.g., confidential data and proprietary models). This paper introduces ZKMLOps, a Machine Learning Operations (MLOps) verification framework that operationalizes Zero-Knowledge Proofs (ZKPs) within MLOps lifecycles; a ZKP allows a prover to convince a verifier that a statement is true without revealing any information about the statement itself. By integrating ZKPs with established software engineering patterns, ZKMLOps provides a modular and repeatable process for generating verifiable cryptographic evidence (proofs of well-defined computational statements about the audited model and its inputs) that auditors can use as input to a regulatory compliance determination. We evaluate the framework along two dimensions. First, framework viability: orchestration overhead is bounded and stable across architecturally heterogeneous ZKP backends and models of increasing size. Second, cost-versus-assurance trade-offs: the audit-on-demand setting is the regime in which full zero-knowledge auditing is the appropriate tool, where it provides confidentiality and integrity guarantees that lighter-weight alternatives cannot match.

AI · Dec 16, 2025
IaC Generation with LLMs: An Error Taxonomy and A Study on Configuration Knowledge Injection

Roman Nekrasov, Stefano Fossati, Indika Kumara et al.

Large Language Models (LLMs) currently exhibit low success rates in generating correct and intent-aligned Infrastructure as Code (IaC). This research investigated methods to improve LLM-based IaC generation, specifically for Terraform, by systematically injecting structured configuration knowledge. To facilitate this, an existing IaC-Eval benchmark was significantly enhanced with cloud emulation and automated error analysis. Additionally, a novel error taxonomy for LLM-assisted IaC code generation was developed. A series of knowledge injection techniques was implemented and evaluated, progressing from Naive Retrieval-Augmented Generation (RAG) to more sophisticated Graph RAG approaches. These included semantic enrichment of graph components and modeling inter-resource dependencies. Experimental results demonstrated that while baseline LLM performance was poor (27.1% overall success), injecting structured configuration knowledge increased technical validation success to 75.3% and overall success to 62.6%. Despite these gains in technical correctness, intent alignment plateaued, revealing a "Correctness-Congruence Gap" where LLMs can become proficient "coders" but remain limited "architects" in fulfilling nuanced user intent.

AI · Apr 5, 2021
DataOps for Societal Intelligence: a Data Pipeline for Labor Market Skills Extraction and Matching

Damian Andrew Tamburri, Willem-Jan Van den Heuvel, Martin Garriga

Big Data analytics backed by AI algorithms can support skills localization and retrieval in the context of a labor market intelligence problem. We formulate and solve this problem through specific DataOps models, bringing together data sources from administrative and technical partners in several countries and creating shared knowledge to support policy and decision-making. We then focus on the critical task of extracting skills from resumes and vacancies using state-of-the-art machine learning models. We showcase preliminary results with applied machine learning on real data from the employment agencies of the Netherlands and the Flemish region in Belgium. The final goal is to match these skills to standard ontologies of skills, jobs and occupations.

CR · Jul 23, 2020
Blockchain and Cryptocurrencies: a Classification and Comparison of Architecture Drivers

Martin Garriga, Stefano Dalla Palma, Maximiliano Arias et al.

Blockchain is a decentralized transaction and data management solution, the technological leap behind the success of Bitcoin and other cryptocurrencies. As the variety of existing blockchains and distributed ledgers continues to increase, adopters should focus on selecting the solution that best fits their needs and the requirements of their decentralized applications, rather than developing yet another blockchain from scratch. In this paper we present a conceptual framework to help software architects, developers, and decision makers adopt the right blockchain technology. The framework exposes the interrelation between technological decisions and architectural features, capturing the knowledge from existing academic literature, industrial products, technical forums/blogs, and experts' feedback. We empirically show the applicability of our framework by dissecting the platforms behind Bitcoin and other top 10 cryptocurrencies, aided by a focus group with researchers and industry practitioners. Then, we leverage the framework together with key notions of the Architectural Tradeoff Analysis Method (ATAM) to analyze four real-world blockchain case studies from industry and academia. Results show that applying our framework leads to a deeper understanding of the architectural tradeoffs, making it possible to assess technologies more objectively and select the one that best fits developers' needs, ultimately cutting costs, reducing time-to-market and accelerating return on investment.

SE · Aug 2, 2019
Towards Surgically-Precise Technical Debt Estimation: Early Results and Research Roadmap

Valentina Lenarduzzi, Antonio Martini, Davide Taibi et al.

The concept of technical debt has been explored from many perspectives, but its precise estimation is still under heavy empirical and experimental inquiry. We aim to understand whether, by harnessing approximate, data-driven, machine-learning approaches, it is possible to improve the current techniques for technical debt estimation, as represented by a top industry quality analysis tool such as SonarQube. For the sake of simplicity, we focus on relatively simple regression modelling techniques and apply them to modelling the additional project cost connected to the sub-optimal conditions existing in the projects under study. Our results show that current techniques can be improved towards a more precise estimation of technical debt, and the case study yields promising results in that direction.