SE, Nov 14, 2025
SQuaD: The Software Quality Dataset
Mikel Robredo, Matteo Esposito, Davide Taibi et al.
Software quality research increasingly relies on large-scale datasets that measure both the product and process aspects of software systems. However, existing resources often focus on limited dimensions, such as code smells, technical debt, or refactoring activity, thereby restricting comprehensive analyses across time and quality dimensions. To address this gap, we present the Software Quality Dataset (SQuaD), a multi-dimensional, time-aware collection of software quality metrics extracted from 450 mature open-source projects across diverse ecosystems, including Apache, Mozilla, FFmpeg, and the Linux kernel. By integrating nine state-of-the-art static analysis tools, i.e., SonarQube, CodeScene, PMD, Understand, CK, JaSoMe, RefactoringMiner, RefactoringMiner++, and PyRef, our dataset unifies over 700 unique metrics at method, class, file, and project levels. Covering a total of 63,586 analyzed project releases, SQuaD also provides version control and issue-tracking histories, software vulnerability data (CVE/CWE), and process metrics proven to enhance Just-In-Time (JIT) defect prediction. SQuaD enables empirical research on maintainability, technical debt, software evolution, and quality assessment at unprecedented scale. We also outline emerging research directions, including automated dataset updates and cross-project quality modeling to support the continuous evolution of software analytics. The dataset is publicly available on Zenodo (DOI: 10.5281/zenodo.17566690).
CV, Jul 19, 2024
How to Blend Concepts in Diffusion Models
Lorenzo Olearo, Giorgio Longari, Simone Melzi et al.
For the last decade, there has been a push to use multi-dimensional (latent) spaces to represent concepts, yet how to manipulate these concepts or reason with them remains largely unclear. Some recent methods exploit multiple latent representations and their connection, making this research question even more entangled. Our goal is to understand how operations in the latent space affect the underlying concepts. To that end, we explore the task of concept blending through diffusion models. Diffusion models are based on a connection between a latent representation of textual prompts and a latent space that enables image reconstruction and generation. This task allows us to try different text-based combination strategies and to evaluate them easily through visual analysis. Our conclusion is that concept blending through space manipulation is possible, although the best strategy depends on the context of the blend.
SE, Jan 5
A Defect is Being Born: How Close Are We? A Time Sensitive Forecasting Approach
Mikel Robredo, Matteo Esposito, Fabio Palomba et al.
Background. Defect prediction has been a highly active topic among researchers in the Empirical Software Engineering field. Previous literature has achieved accurate predictions of incoming faults and, through just-in-time prediction, has identified the features and anomalies that precede them. As software systems evolve continuously, there is a growing need for time-sensitive methods capable of forecasting defects before they manifest. Aim. Our study seeks to explore the effectiveness of time-sensitive techniques for defect forecasting. Moreover, we aim to investigate the early indicators that precede the occurrence of a defect. Method. We will train multiple time-sensitive forecasting techniques to forecast the future bug density of a software project, as well as identify the early symptoms preceding the occurrence of a defect. Expected results. We expect to provide empirical evidence on the effectiveness of our approach for the early estimation of bug proneness.
AI, Sep 14, 2024
Enumerating Minimal Unsatisfiable Cores of LTLf formulas
Antonio Ielo, Giuseppe Mazzotta, Rafael Peñaloza et al.
Linear Temporal Logic over finite traces ($\text{LTL}_f$) is a widely used formalism with applications in AI, process mining, model checking, and more. The primary reasoning task for $\text{LTL}_f$ is satisfiability checking; yet, the recent focus on explainable AI has increased interest in analyzing inconsistent formulas, making the enumeration of minimal explanations for infeasibility a relevant task also for $\text{LTL}_f$. This paper introduces a novel technique for enumerating minimal unsatisfiable cores (MUCs) of an $\text{LTL}_f$ specification. The main idea is to encode an $\text{LTL}_f$ formula into an Answer Set Programming (ASP) specification, such that the minimal unsatisfiable subsets (MUSes) of the ASP program directly correspond to the MUCs of the original $\text{LTL}_f$ specification. Leveraging recent advancements in ASP solving yields a MUC enumerator achieving good performance in experiments conducted on established benchmarks from the literature.
LO, Oct 25, 2023
Semiring Provenance for Lightweight Description Logics
Camille Bourgaux, Ana Ozaki, Rafael Peñaloza
We investigate semiring provenance--a successful framework originally defined in the relational database setting--for description logics. In this context, the ontology axioms are annotated with elements of a commutative semiring and these annotations are propagated to the ontology consequences in a way that reflects how they are derived. We define a provenance semantics for a language that encompasses several lightweight description logics and show its relationships with semantics that have been defined for ontologies annotated with a specific kind of annotation (such as fuzzy degrees). We show that under some restrictions on the semiring, the semantics satisfies desirable properties (such as extending the semiring provenance defined for databases). We then focus on the well-known why-provenance, for which we study the complexity of problems related to the provenance of an assertion or a conjunctive query answer. Finally, we consider two more restricted cases which correspond to the so-called positive Boolean provenance and lineage in the database setting. For these cases, we exhibit relationships with well-known notions related to explanations in description logics and complete our complexity analysis. As a side contribution, we provide conditions on an $\mathcal{ELHI}_\bot$ ontology that guarantee tractable reasoning.
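As an illustration of how annotations can propagate to consequences, the following is a minimal Python sketch of why-provenance in its usual database-style representation: an annotation is a set of token sets, alternative derivations are combined by union, and joint use is combined by pairwise union. The three axioms, the token names `t1` to `t3`, and the helper functions are hypothetical and are not taken from the paper.

```python
# Why-provenance sketch: an annotation is a set of "witnesses",
# each witness being a set of axiom tokens that are used together.

def token(name):
    """Annotation of a single axiom labelled with `name`."""
    return frozenset({frozenset({name})})

def plus(p, q):
    """Alternative derivations: keep the witnesses of both annotations."""
    return p | q

def times(p, q):
    """Joint use: combine every witness of p with every witness of q."""
    return frozenset(a | b for a in p for b in q)

# Hypothetical annotated axioms: A ⊑ B (t1), B ⊑ C (t2), A ⊑ C (t3).
t1, t2, t3 = token("t1"), token("t2"), token("t3")

# A ⊑ C follows directly from t3, or from t1 and t2 used together.
prov_a_sub_c = plus(t3, times(t1, t2))

# Print the witnesses in a stable order.
for witness in sorted(sorted(w) for w in prov_a_sub_c):
    print(witness)
```

Each printed witness is one set of axioms that suffices to derive the consequence in this toy setting.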
SE, Jan 22, 2025
A Call for Critically Rethinking and Reforming Data Analysis in Empirical Software Engineering
Matteo Esposito, Mikel Robredo, Murali Sridharan et al.
Context: Empirical Software Engineering (ESE) drives innovation in SE through qualitative and quantitative studies. However, concerns about the correct application of empirical methodologies have existed since the 2006 Dagstuhl seminar on SE. Objective: To analyze three decades of SE research, identify mistakes in statistical methods, and evaluate experts' ability to detect and address these issues. Methods: We conducted a literature survey of ~27,000 empirical studies, using LLMs to classify statistical methodologies as adequate or inadequate. Additionally, we selected 30 primary studies and held a workshop with 33 ESE experts to assess their ability to identify and resolve statistical issues. Results: Significant statistical issues were found in the primary studies, and experts showed limited ability to detect and correct these methodological problems, raising concerns about the broader ESE community's proficiency in this area. Conclusions: Despite our study's potential limitations, its results shed light on recurring issues that stem from copying and pasting information from past authors' works and from the continued publication of inadequate approaches, which promote dubious results and jeopardize the spread of correct statistical strategies among researchers. Moreover, they justify further investigation into empirical rigor in software engineering to expose these recurring issues and establish a framework for reassessing our field's foundation of statistical methodology application. Therefore, this work calls for critically rethinking and reforming data analysis in empirical software engineering, paving the way for our future work.
AI, Oct 27, 2025
When No Paths Lead to Rome: Benchmarking Systematic Neural Relational Reasoning
Anirban Das, Irtaza Khalid, Rafael Peñaloza et al.
Designing models that can learn to reason in a systematic way is an important and long-standing challenge. In recent years, a wide range of solutions have been proposed for the specific case of systematic relational reasoning, including Neuro-Symbolic approaches, variants of the Transformer architecture, and specialised Graph Neural Networks. However, existing benchmarks for systematic relational reasoning focus on an overly simplified setting, based on the assumption that reasoning can be reduced to composing relational paths. In fact, this assumption is hard-baked into the architecture of several recent models, leading to approaches that can perform well on existing benchmarks but are difficult to generalise to other settings. To support further progress in the field of systematic relational reasoning with neural networks, we introduce NoRA, a new benchmark which adds several levels of difficulty and requires models to go beyond path-based reasoning.
AI, Oct 16, 2025
NAEL: Non-Anthropocentric Ethical Logic
Bianca Maria Lerma, Rafael Peñaloza
We introduce NAEL (Non-Anthropocentric Ethical Logic), a novel ethical framework for artificial agents grounded in active inference and symbolic reasoning. Departing from conventional, human-centred approaches to AI ethics, NAEL formalizes ethical behaviour as an emergent property of intelligent systems minimizing global expected free energy in dynamic, multi-agent environments. We propose a neuro-symbolic architecture to allow agents to evaluate the ethical consequences of their actions in uncertain settings. The proposed system addresses the limitations of existing ethical models by allowing agents to develop context-sensitive, adaptive, and relational ethical behaviour without presupposing anthropomorphic moral intuitions. A case study involving ethical resource distribution illustrates NAEL's dynamic balancing of self-preservation, epistemic learning, and collective welfare.
SE, Sep 9, 2025
What Were You Thinking? An LLM-Driven Large-Scale Study of Refactoring Motivations in Open-Source Projects
Mikel Robredo, Matteo Esposito, Fabio Palomba et al.
Context. Code refactoring improves software quality without changing external behavior. Despite its advantages, its benefits are hindered by the considerable cost of time, resources, and continuous effort it demands. Aim. Understanding why developers refactor, and which metrics capture these motivations, may support wider and more effective use of refactoring in practice. Method. We performed a large-scale empirical study to analyze developers' refactoring activity, leveraging Large Language Models (LLMs) to identify underlying motivations from version control data and comparing our findings with previous motivations reported in the literature. Results. LLMs matched human judgment in 80% of cases, but aligned with literature-based motivations in only 47%. They enriched 22% of motivations with more detailed rationale, often highlighting readability, clarity, and structural improvements. Most motivations were pragmatic, focused on simplification and maintainability. While metrics related to developer experience and code readability ranked highest, their correlation with motivation categories was weak. Conclusions. We conclude that LLMs effectively capture surface-level motivations but struggle with architectural reasoning. Their value lies in providing localized explanations, which, when combined with software metrics, can form hybrid approaches. Such integration offers a promising path toward prioritizing refactoring more systematically and balancing short-term improvements with long-term architectural goals.
CV, Jun 30, 2025
Blending Concepts with Text-to-Image Diffusion Models
Lorenzo Olearo, Giorgio Longari, Alessandro Raganato et al.
Diffusion models have dramatically advanced text-to-image generation in recent years, translating abstract concepts into high-fidelity images with remarkable ease. In this work, we examine whether they can also blend distinct concepts, ranging from concrete objects to intangible ideas, into coherent new visual entities under a zero-shot framework. Specifically, concept blending merges the key attributes of multiple concepts (expressed as textual prompts) into a single, novel image that captures the essence of each concept. We investigate four blending methods, each exploiting different aspects of the diffusion pipeline (e.g., prompt scheduling, embedding interpolation, or layer-wise conditioning). Through systematic experimentation across diverse concept categories, such as merging concrete concepts, synthesizing compound words, transferring artistic styles, and blending architectural landmarks, we show that modern diffusion models indeed exhibit creative blending capabilities without further training or fine-tuning. Our extensive user study, involving 100 participants, reveals that no single approach dominates in all scenarios: each blending technique excels under certain conditions, with factors like prompt ordering, conceptual distance, and random seed affecting the outcome. These findings highlight the remarkable compositional potential of diffusion models while exposing their sensitivity to seemingly minor input variations.
CL, May 1, 2025
Reasoning Capabilities and Invariability of Large Language Models
Alessandro Raganato, Rafael Peñaloza, Marco Viviani et al.
Large Language Models (LLMs) have shown remarkable capabilities in manipulating natural language across multiple applications, but their ability to handle simple reasoning tasks is often questioned. In this work, we aim to provide a comprehensive analysis of LLMs' reasoning competence, specifically focusing on their prompt dependency. In particular, we introduce a new benchmark dataset with a series of simple reasoning questions demanding shallow logical reasoning. Aligned with cognitive psychology standards, the questions are confined to a basic domain revolving around geometric figures, ensuring that responses are independent of any pre-existing intuition about the world and rely solely on deduction. An empirical analysis involving zero-shot and few-shot prompting across 24 LLMs of different sizes reveals that, while LLMs with over 70 billion parameters perform better in the zero-shot setting, there is still considerable room for improvement. An additional test with chain-of-thought prompting over 22 LLMs shows that this additional prompt can aid or damage the performance of models, depending on whether the rationale is required before or after the answer.
AI, Nov 23, 2021
Answering Fuzzy Queries over Fuzzy DL-Lite Ontologies
Gabriella Pasi, Rafael Peñaloza
A prominent problem in knowledge representation is how to answer queries taking into account also the implicit consequences of an ontology representing domain knowledge. While this problem has been widely studied within the realm of description logic ontologies, it has been surprisingly neglected within the context of vague or imprecise knowledge, particularly from the point of view of mathematical fuzzy logic. In this paper we study the problem of answering conjunctive queries and threshold queries w.r.t. ontologies in fuzzy DL-Lite. Specifically, we show through a rewriting approach that threshold query answering w.r.t. consistent ontologies remains in $AC^0$ in data complexity, but that conjunctive query answering is highly dependent on the selected triangular norm, which has an impact on the underlying semantics. For the idempotent Gödel t-norm, we provide an effective method based on a reduction to the classical case. This paper is under consideration in Theory and Practice of Logic Programming (TPLP).
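To make the dependence on the triangular norm concrete, the sketch below combines the degrees of two query atoms under three standard t-norms (Gödel, product, and Łukasiewicz). It is only an illustration of the standard definitions, not the rewriting approach of the paper; the atom degrees 0.6 and 0.7 and the function names are hypothetical.

```python
from functools import reduce

def godel(x, y):
    """Gödel t-norm: the minimum of the two degrees."""
    return min(x, y)

def product(x, y):
    """Product t-norm: ordinary multiplication."""
    return x * y

def lukasiewicz(x, y):
    """Łukasiewicz t-norm: bounded sum shifted by one."""
    return max(0.0, x + y - 1.0)

def conjunction_degree(tnorm, degrees):
    """Degree of a conjunction of atoms, folding the t-norm over their degrees."""
    return reduce(tnorm, degrees)

# Hypothetical degrees of two atoms in a conjunctive query.
atom_degrees = [0.6, 0.7]

for tnorm in (godel, product, lukasiewicz):
    print(tnorm.__name__, conjunction_degree(tnorm, atom_degrees))

# Idempotency: combining a degree with itself leaves it unchanged
# under the Gödel t-norm, but not under the product t-norm.
print(godel(0.6, 0.6) == 0.6, product(0.6, 0.6) == 0.6)
```

The same pair of atom degrees yields three different conjunction degrees, which is why the choice of t-norm changes the semantics of a conjunctive query.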
LO, Sep 23, 2021
Union and Intersection of all Justifications
Jieying Chen, Yue Ma, Rafael Peñaloza et al.
We present a new algorithm for computing the union and intersection of all justifications for a given ontological consequence without first computing the set of all justifications. Through an empirical evaluation, we show that our approach works well in practice for expressive DLs. In particular, the union of all justifications can be computed much faster than with existing justification-enumeration approaches. We further discuss how to use these results to repair ontologies efficiently.
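For reference, the notions involved can be sketched with a naive baseline that does enumerate every justification first, which is exactly the step the abstract says the proposed approach avoids. The toy ontology of Horn rules, the axiom names, and all function names below are hypothetical; a justification is taken to be a minimal set of axioms that entails the consequence.

```python
from itertools import combinations

# Hypothetical toy ontology: each axiom is a Horn rule (body, head).
AXIOMS = {
    "ax1": (frozenset(), "A"),
    "ax2": (frozenset({"A"}), "B"),
    "ax3": (frozenset({"B"}), "C"),
    "ax4": (frozenset({"A"}), "C"),
    "ax5": (frozenset(), "D"),
}

def entails(names, goal):
    """Forward chaining over the selected axioms until nothing new is derived."""
    derived = set()
    changed = True
    while changed:
        changed = False
        for name in names:
            body, head = AXIOMS[name]
            if body <= derived and head not in derived:
                derived.add(head)
                changed = True
    return goal in derived

def justifications(goal):
    """Enumerate all minimal axiom sets entailing `goal`, smallest first."""
    found = []
    names = sorted(AXIOMS)
    for size in range(len(names) + 1):
        for subset in combinations(names, size):
            candidate = set(subset)
            # Supersets of a known justification cannot be minimal.
            if any(j <= candidate for j in found):
                continue
            if entails(candidate, goal):
                found.append(candidate)
    return found

def union_and_intersection(goal):
    """Union and intersection of all justifications, via full enumeration."""
    justs = justifications(goal)
    if not justs:
        return set(), set()
    return set().union(*justs), set.intersection(*justs)

union, intersection = union_and_intersection("C")
print(sorted(union), sorted(intersection))
```

The enumeration is exponential in the number of axioms, so this baseline is only usable on toy inputs.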
LO, Jul 1, 2020
Reasoning with Contextual Knowledge and Influence Diagrams
Erman Acar, Rafael Peñaloza
Influence diagrams (IDs) are well-known formalisms extending Bayesian networks to model decision situations under uncertainty. Although they are convenient as a decision theoretic tool, their knowledge representation ability is limited in capturing other crucial notions such as logical consistency. We complement IDs with the light-weight description logic (DL) EL to overcome such limitations. We consider a setup where DL axioms hold in some contexts, yet the actual context is uncertain. The framework benefits from the convenience of using DL as a domain knowledge representation language and the modelling strength of IDs to deal with decisions over contexts in the presence of contextual uncertainty. We define related reasoning problems and study their computational complexity.
LO, Mar 18, 2020
Axiom Pinpointing
Rafael Peñaloza
Axiom pinpointing refers to the task of finding the specific axioms in an ontology which are responsible for a consequence to follow. This task has been studied, under different names, in many research areas, leading to a reformulation and reinvention of techniques. In this work, we present a general overview of axiom pinpointing, providing the basic notions, different approaches for solving it, and some variations and applications which have been considered in the literature. This should serve as a starting point for researchers interested in related problems, with an ample bibliography for delving deeper into the details.
LO, Jan 21, 2020
Provenance for the Description Logic ELHr
Camille Bourgaux, Ana Ozaki, Rafael Peñaloza et al.
We address the problem of handling provenance information in ELHr ontologies. We consider a setting recently introduced for ontology-based data access, based on semirings and extending classical data provenance, in which ontology axioms are annotated with provenance tokens. A consequence inherits the provenance of the axioms involved in deriving it, yielding a provenance polynomial as an annotation. We analyse the semantics for the ELHr case and show that the presence of conjunctions poses various difficulties for handling provenance, some of which are mitigated by assuming multiplicative idempotency of the semiring. Under this assumption, we study three problems: ontology completion with provenance, computing the set of relevant axioms for a consequence, and query answering.
LO, Mar 12, 2019
Temporal Logics Over Finite Traces with Uncertainty (Technical Report)
Fabrizio M. Maggi, Marco Montali, Rafael Peñaloza
Temporal logics over finite traces have recently seen wide application in a number of areas, from business process modelling, monitoring, and mining to planning and decision making. However, real-life dynamic systems contain a degree of uncertainty which cannot be handled with classical logics. We thus propose a new probabilistic temporal logic over finite traces using superposition semantics, where all evolutions are possible until observed. We study the properties of the logic and provide automata-based mechanisms for deriving probabilistic inferences from its formulas. We then study a fragment of the logic with better computational properties. Notably, formulas in this fragment can be discovered from event log data using existing off-the-shelf declarative process discovery techniques.
AI, Jul 25, 2017
A Decidable Very Expressive Description Logic for Databases (Extended Version)
Alessandro Artale, Enrico Franconi, Rafael Peñaloza et al.
We introduce $\mathcal{DLR}^+$, an extension of the n-ary propositionally closed description logic $\mathcal{DLR}$ to deal with attribute-labelled tuples (generalising the positional notation), projections of relations, and global and local objectification of relations, able to express inclusion, functional, key, and external uniqueness dependencies. The logic is equipped with both TBox and ABox axioms. We show how a simple syntactic restriction on the appearance of projections sharing common attributes in a $\mathcal{DLR}^+$ knowledge base makes reasoning in the language decidable with the same computational complexity as $\mathcal{DLR}$. The obtained $\mathcal{DLR}^\pm$ n-ary description logic is able to encode more thoroughly conceptual data models such as EER, UML, and ORM.
AI, Jun 10, 2017
Towards Statistical Reasoning in Description Logics over Finite Domains (Full Version)
Rafael Peñaloza, Nico Potyka
We present a probabilistic extension of the description logic $\mathcal{ALC}$ for reasoning about statistical knowledge. We consider conditional statements over proportions of the domain and are interested in the probabilistic-logical consequences of these proportions. After introducing some general reasoning problems and analyzing their properties, we present first algorithms and complexity results for reasoning in some fragments of Statistical $\mathcal{ALC}$.
AI, Jun 30, 2016
Probabilistic Reasoning in the Description Logic ALCP with the Principle of Maximum Entropy (Full Version)
Rafael Peñaloza, Nico Potyka
A central question for knowledge representation is how to encode and handle uncertain knowledge adequately. We introduce the probabilistic description logic ALCP that is designed for representing context-dependent knowledge, where the actual context taking place is uncertain. ALCP allows the expression of logical dependencies on the domain and probabilistic dependencies on the possible contexts. In order to draw probabilistic conclusions, we employ the principle of maximum entropy. We provide reasoning algorithms for this logic, and show that it satisfies several desirable properties of probabilistic logics.
AI, Sep 29, 2015
Reasoning in Infinitely Valued G-IALCQ
Stefan Borgwardt, Rafael Peñaloza
Fuzzy Description Logics (FDLs) are logic-based formalisms used to represent and reason with vague or imprecise knowledge. It has been recently shown that reasoning in most FDLs using truth values from the interval [0,1] becomes undecidable in the presence of a negation constructor and general concept inclusion axioms. One exception to this negative result are FDLs whose semantics is based on the infinitely valued Gödel t-norm (G). In this paper, we extend previous decidability results for G-IALC to deal also with qualified number restrictions. Our novel approach is based on a combination of the known crispification technique for finitely valued FDLs and the automata-based procedure originally developed for reasoning in G-IALC. The proposed approach combines the advantages of these two methods, while removing their respective drawbacks.
LO, Aug 11, 2015
Answering Fuzzy Conjunctive Queries over Finitely Valued Fuzzy Ontologies
Stefan Borgwardt, Theofilos Mailis, Rafael Peñaloza et al.
Fuzzy Description Logics (DLs) provide a means for representing vague knowledge about an application domain. In this paper, we study fuzzy extensions of conjunctive queries (CQs) over the DL $\mathcal{SROIQ}$ based on finite chains of degrees of truth. To answer such queries, we extend a well-known technique that reduces the fuzzy ontology to a classical one, and use classical DL reasoners as a black box. We improve the complexity of previous reduction techniques for finitely valued fuzzy DLs, which allows us to prove tight complexity results for answering certain kinds of fuzzy CQs. We conclude with an experimental evaluation of a prototype implementation, showing the feasibility of our approach.
AI, Jun 26, 2015
Dynamic Bayesian Ontology Languages
İsmail İlkan Ceylan, Rafael Peñaloza
Many formalisms combining ontology languages with uncertainty, usually in the form of probabilities, have been studied over the years. Most of these formalisms, however, assume that the probabilistic structure of the knowledge remains static over time. We present a general approach for extending ontology languages to handle time-evolving uncertainty represented by a dynamic Bayesian network. We show how reasoning in the original language and dynamic Bayesian inferences can be exploited for effective reasoning in our framework.