David Nader Palacio

SE
h-index7
5papers
435citations
Novelty26%
AI Score41

5 Papers

SEMay 27
Rethinking Software Empirical Studies with Structural Causal Models

Daniel Rodriguez-Cardenas, Aya Garryyeva, David Nader Palacio et al.

Causal Inference offers a fundamental approach for advancing empirical software engineering (ESE) beyond traditional statistical association, enabling researchers to rigorously identify and quantify causal relationships in software experiments. This paper introduces CausalSE, a framework that operationalizes Judea Pearl's causal inference paradigm in ESE context. The paper focuses on Structural Causal Models (SCMs) to address the limitations of classical statistical methods in mitigating confounding bias. Through a case study using the Galeras dataset and propensity score matching, we demonstrate how CausalSE disentangles the effect of prompt engineering strategies on code generation outcomes in a popular LLM (i.e., GPT-3). The results reveal that while associational analyses can suggest improvements in certain interventions (e.g., more complex prompts), causal analysis often does not find a significant treatment effect, highlighting the risk of false positives when confounding is not addressed. By providing a tutorial-based methodology and a real-world case study, this work equips software researchers with practical tools to design, analyze, and interpret software experiments with methodological rigor, ultimately enabling more informed and actionable conclusions in both research and practice.

SEApr 15
Towards Enabling An Artificial Self-Construction Software Life-cycle via Autopoietic Architectures

Daniel Rodriguez-Cardenas, David Nader Palacio, Denys Poshyvanyk

Software engineering research has focused on automating maintenance and evolution processes to reduce costs and improve reliability. The emergence of foundation models (FMs) with strong code understanding and reasoning abilities offers new opportunities for autonomous software behavior. Inspired by Artificial Life (ALife), we propose a fundamental shift in the Software Development Life-Cycle (SDLC) by introducing self-construction mechanisms that enable software to evolve and maintain autonomously. This position paper explores the potential of Autopoietic Architectures, specifically Psi-Arch, as a foundational framework for self-constructing software. We first analyze the limitations of traditional maintenance approaches and identify gaps in current SDLC automation. Subsequently, we outline the core challenges in achieving self-construction, including the integration of foundation-model-based reasoning units and the establishment of novel architectural paradigms. Although this paper does not present a definitive solution, it seeks to catalyze discourse and inspire research toward a new paradigm in software engineering, one in which self-constructing software represents the next frontier in SDLC automation.

SENov 28, 2024
Perspective of Software Engineering Researchers on Machine Learning Practices Regarding Research, Review, and Education

Anamaria Mojica-Hanke, David Nader Palacio, Denys Poshyvanyk et al.

Context: Machine Learning (ML) significantly impacts Software Engineering (SE), but studies mainly focus on practitioners, neglecting researchers. This overlooks practices and challenges in teaching, researching, or reviewing ML applications in SE. Objective: This study aims to contribute to the knowledge, about the synergy between ML and SE from the perspective of SE researchers, by providing insights into the practices followed when researching, teaching, and reviewing SE studies that apply ML. Method: We analyzed SE researchers familiar with ML or who authored SE articles using ML, along with the articles themselves. We examined practices, SE tasks addressed with ML, challenges faced, and reviewers' and educators' perspectives using grounded theory coding and qualitative analysis. Results: We found diverse practices focusing on data collection, model training, and evaluation. Some recommended practices (e.g., hyperparameter tuning) appeared in less than 20\% of literature. Common challenges involve data handling, model evaluation (incl. non-functional properties), and involving human expertise in evaluation. Hands-on activities are common in education, though traditional methods persist. Conclusion: Despite accepted practices in applying ML to SE, significant gaps remain. By enhancing guidelines, adopting diverse teaching methods, and emphasizing underrepresented practices, the SE community can bridge these gaps and advance the field.

SEFeb 3, 2021
Studying the Usage of Text-To-Text Transfer Transformer to Support Code-Related Tasks

Antonio Mastropaolo, Simone Scalabrino, Nathan Cooper et al.

Deep learning (DL) techniques are gaining more and more attention in the software engineering community. They have been used to support several code-related tasks, such as automatic bug fixing and code comments generation. Recent studies in the Natural Language Processing (NLP) field have shown that the Text-To-Text Transfer Transformer (T5) architecture can achieve state-of-the-art performance for a variety of NLP tasks. The basic idea behind T5 is to first pre-train a model on a large and generic dataset using a self-supervised task ( e.g: filling masked words in sentences). Once the model is pre-trained, it is fine-tuned on smaller and specialized datasets, each one related to a specific task ( e.g: language translation, sentence classification). In this paper, we empirically investigate how the T5 model performs when pre-trained and fine-tuned to support code-related tasks. We pre-train a T5 model on a dataset composed of natural language English text and source code. Then, we fine-tune such a model by reusing datasets used in four previous works that used DL techniques to: (i) fix bugs, (ii) inject code mutants, (iii) generate assert statements, and (iv) generate code comments. We compared the performance of this single model with the results reported in the four original papers proposing DL-based solutions for those four tasks. We show that our T5 model, exploiting additional data for the self-supervised pre-training phase, can achieve performance improvements over the four baselines.

SESep 14, 2020
A Systematic Literature Review on the Use of Deep Learning in Software Engineering Research

Cody Watson, Nathan Cooper, David Nader Palacio et al.

An increasingly popular set of techniques adopted by software engineering (SE) researchers to automate development tasks are those rooted in the concept of Deep Learning (DL). The popularity of such techniques largely stems from their automated feature engineering capabilities, which aid in modeling software artifacts. However, due to the rapid pace at which DL techniques have been adopted, it is difficult to distill the current successes, failures, and opportunities of the current research landscape. In an effort to bring clarity to this crosscutting area of work, from its modern inception to the present, this paper presents a systematic literature review of research at the intersection of SE & DL. The review canvases work appearing in the most prominent SE and DL conferences and journals and spans 128 papers across 23 unique SE tasks. We center our analysis around the components of learning, a set of principles that govern the application of machine learning techniques (ML) to a given problem domain, discussing several aspects of the surveyed work at a granular level. The end result of our analysis is a research roadmap that both delineates the foundations of DL techniques applied to SE research, and highlights likely areas of fertile exploration for the future.