Damian A. Tamburri

h-index36

20papers

247citations

Novelty30%

AI Score28

Ranked #150,576 of 194,257 authors (top 78%)#1,742 in SE (top 57%)

20 Papers

2.3MEApr 18, 2023Code

Unveiling and unraveling aggregation and dispersion fallacies in group MCDM

Majid Mohammadi, Damian A. Tamburri, Jafar Rezaei

Priorities in multi-criteria decision-making (MCDM) convey the relevance preference of one criterion over another, which is usually reflected by imposing the non-negativity and unit-sum constraints. The processing of such priorities is different than other unconstrained data, but this point is often neglected by researchers, which results in fallacious statistical analysis. This article studies three prevalent fallacies in group MCDM along with solutions based on compositional data analysis to avoid misusing statistical operations. First, we use a compositional approach to aggregate the priorities of a group of DMs and show that the outcome of the compositional analysis is identical to the normalized geometric mean, meaning that the arithmetic mean should be avoided. Furthermore, a new aggregation method is developed, which is a robust surrogate for the geometric mean. We also discuss the errors in computing measures of dispersion, including standard deviation and distance functions. Discussing the fallacies in computing the standard deviation, we provide a probabilistic criteria ranking by developing proper Bayesian tests, where we calculate the extent to which a criterion is more important than another. Finally, we explain the errors in computing the distance between priorities, and a clustering algorithm is specially tailored based on proper distance metrics.

1.5CVNov 28, 2023

When the Few Outweigh the Many: Illicit Content Recognition with Few-Shot Learning

G. Cascavilla, G. Catolino, M. Conti et al.

The anonymity and untraceability benefits of the Dark web account for the exponentially-increased potential of its popularity while creating a suitable womb for many illicit activities, to date. Hence, in collaboration with cybersecurity and law enforcement agencies, research has provided approaches for recognizing and classifying illicit activities with most exploiting textual dark web markets' content recognition; few such approaches use images that originated from dark web content. This paper investigates this alternative technique for recognizing illegal activities from images. In particular, we investigate label-agnostic learning techniques like One-Shot and Few-Shot learning featuring the use Siamese neural networks, a state-of-the-art approach in the field. Our solution manages to handle small-scale datasets with promising accuracy. In particular, Siamese neural networks reach 90.9% on 20-Shot experiments over a 10-class dataset; this leads us to conclude that such models are a promising and cheaper alternative to the definition of automated law-enforcing machinery over the dark web.

2.9CRApr 1, 2022

Internet-of-Things Architectures for Secure Cyber-Physical Spaces: the VISOR Experience Report

Daniel De Pascale, Giuseppe Cascavilla, Mirella Sangiovanni et al.

Internet of things (IoT) technologies are becoming a more and more widespread part of civilian life in common urban spaces, which are rapidly turning into cyber-physical spaces. Simultaneously, the fear of terrorism and crime in such public spaces is ever-increasing. Due to the resulting increased demand for security, video-based IoT surveillance systems have become an important area for research. Considering the large number of devices involved in the illicit recognition task, we conducted a field study in a Dutch Easter music festival in a national interest project called VISOR to select the most appropriate device configuration in terms of performance and results. We iteratively architected solutions for the security of cyber-physical spaces using IoT devices. We tested the performance of multiple federated devices encompassing drones, closed-circuit television, smart phone cameras, and smart glasses to detect real-case scenarios of potentially malicious activities such as mosh-pits and pick-pocketing. Our results pave the way to select optimal IoT architecture configurations -- i.e., a mix of CCTV, drones, smart glasses, and camera phones in our case -- to make safer cyber-physical spaces' a reality.

3.3NEJul 12, 2024

A Scale-Invariant Diagnostic Approach Towards Understanding Dynamics of Deep Neural Networks

Ambarish Moharil, Damian Tamburri, Indika Kumara et al.

This paper introduces a scale-invariant methodology employing \textit{Fractal Geometry} to analyze and explain the nonlinear dynamics of complex connectionist systems. By leveraging architectural self-similarity in Deep Neural Networks (DNNs), we quantify fractal dimensions and \textit{roughness} to deeply understand their dynamics and enhance the quality of \textit{intrinsic} explanations. Our approach integrates principles from Chaos Theory to improve visualizations of fractal evolution and utilizes a Graph-Based Neural Network for reconstructing network topology. This strategy aims at advancing the \textit{intrinsic} explainability of connectionist Artificial Intelligence (AI) systems.

3.6SENov 23, 2021Code

RepoMiner: a Language-agnostic Python Framework to Mine Software Repositories for Defect Prediction

Stefano Dalla Palma, Dario Di Nucci, Damian Tamburri

Data originating from open-source software projects provide valuable information to enhance software quality. In the scope of Software Defect Prediction, one of the most challenging parts is extracting valid data about failure-prone software components from these repositories, which can help develop more robust software. In particular, collecting data, calculating metrics, and synthesizing results from these repositories is a tedious and error-prone task, which often requires understanding the programming languages involved in the mined repositories, eventually leading to a proliferation of language-specific data-mining software. This paper presents RepoMiner, a language-agnostic tool developed to support software engineering researchers in creating datasets to support any study on defect prediction. RepoMiner automatically collects failure data from software components, labels them as failure-prone or neutral, and calculates metrics to be used as ground truth for defect prediction models. We present its implementation and provide examples of its application.

6.4SEMay 17, 2021Code

In Search of Socio-Technical Congruence: A Large-Scale Longitudinal Study

Wolfgang Mauerer, Mitchell Joblin, Damian A. Tamburri et al.

We report on a large-scale empirical study investigating the relevance of socio-technical congruence over key basic software quality metrics, namely, bugs and churn. In particular, we explore whether alignment or misalignment of social communication structures and technical dependencies in large software projects influences software quality. To this end, we have defined a quantitative and operational notion of socio-technical congruence, which we call socio-technical motif congruence (STMC). STMC is a measure of the degree to which developers working on the same file or on two related files, need to communicate. As socio-technical congruence is a complex and multi-faceted phenomenon, the interpretability of the results is one of our main concerns, so we have employed a careful mixed-methods statistical analysis. In particular, we provide analyses with similar techniques as employed by seminal work in the field to ensure comparability of our results with the existing body of work. The major result of our study, based on an analysis of 25 large open-source projects, is that STMC is not related to project quality measures -- software bugs and churn -- in any temporal scenario. That is, we find no statistical relationship between the alignment of developer tasks and developer communications on the one hand, and project outcomes on the other hand. We conclude that, wherefore congruence does matter as literature shows, then its measurable effect lies elsewhere.

3.6SEMay 7, 2021Code

Detecting Security Fixes in Open-Source Repositories using Static Code Analyzers

Therese Fehrer, Rocío Cabrera Lozoya, Antonino Sabetta et al.

The sources of reliable, code-level information about vulnerabilities that affect open-source software (OSS) are scarce, which hinders a broad adoption of advanced tools that provide code-level detection and assessment of vulnerable OSS dependencies. In this paper, we study the extent to which the output of off-the-shelf static code analyzers can be used as a source of features to represent commits in Machine Learning (ML) applications. In particular, we investigate how such features can be used to construct embeddings and train ML models to automatically identify source code commits that contain vulnerability fixes. We analyze such embeddings for security-relevant and non-security-relevant commits, and we show that, although in isolation they are not different in a statistically significant manner, it is possible to use them to construct a ML pipeline that achieves results comparable with the state of the art. We also found that the combination of our method with commit2vec represents a tangible improvement over the state of the art in the automatic identification of commits that fix vulnerabilities: the ML models we construct and commit2vec are complementary, the former being more generally applicable, albeit not as accurate.

7.3SESep 22, 2020Code

DeepIaC: Deep Learning-Based Linguistic Anti-pattern Detection in IaC

Nemania Borovits, Indika Kumara, Parvathy Krishnan et al.

Linguistic anti-patterns are recurring poor practices concerning inconsistencies among the naming, documentation, and implementation of an entity. They impede readability, understandability, and maintainability of source code. This paper attempts to detect linguistic anti-patterns in infrastructure as code (IaC) scripts used to provision and manage computing environments. In particular, we consider inconsistencies between the logic/body of IaC code units and their names. To this end, we propose a novel automated approach that employs word embeddings and deep learning techniques. We build and use the abstract syntax tree of IaC code units to create their code embedments. Our experiments with a dataset systematically extracted from open source repositories show that our approach yields an accuracy between0.785and0.915in detecting inconsistencies

1.6LGMay 12, 2021Code

An efficient projection neural network for $\ell_1$-regularized logistic regression

Majid Mohammadi, Amir Ahooye Atashin, Damian A. Tamburri

$\ell_1$ regularization has been used for logistic regression to circumvent the overfitting and use the estimated sparse coefficient for feature selection. However, the challenge of such a regularization is that the $\ell_1$ norm is not differentiable, making the standard algorithms for convex optimization not applicable to this problem. This paper presents a simple projection neural network for $\ell_1$-regularized logistics regression. In contrast to many available solvers in the literature, the proposed neural network does not require any extra auxiliary variable nor any smooth approximation, and its complexity is almost identical to that of the gradient descent for logistic regression without $\ell_1$ regularization, thanks to the projection operator. We also investigate the convergence of the proposed neural network by using the Lyapunov theory and show that it converges to a solution of the problem with any arbitrary initial value. The proposed neural solution significantly outperforms state-of-the-art methods with respect to the execution time and is competitive in terms of accuracy and AUROC.

3.6SEMay 4, 2021

QSOC: Quantum Service-Oriented Computing

Indika Kumara, Willem-Jan Van Den Heuvel, Damian A. Tamburri

Quantum computing is quickly turning from a promise to a reality, witnessing the launch of several cloud-based, general-purpose offerings, and IDEs. Unfortunately, however, existing solutions typically implicitly assume intimate knowledge about quantum computing concepts and operators. This paper introduces Quantum Service-Oriented Computing (QSOC), including a model-driven methodology to allow enterprise DevOps teams to compose, configure and operate enterprise applications without intimate knowledge on the underlying quantum infrastructure, advocating knowledge reuse, separation of concerns, resource optimization, and mixed quantum- & conventional QSOC applications.

10.4SEMar 24, 2021

Automated Mapping of Vulnerability Advisories onto their Fix Commits in Open Source Repositories

Daan Hommersom, Antonino Sabetta, Bonaventura Coppola et al.

The lack of comprehensive sources of accurate vulnerability data represents a critical obstacle to studying and understanding software vulnerabilities (and their corrections). In this paper, we present an approach that combines heuristics stemming from practical experience and machine-learning (ML) - specifically, natural language processing (NLP) - to address this problem. Our method consists of three phases. First, an advisory record containing key information about a vulnerability is extracted from an advisory (expressed in natural language). Second, using heuristics, a subset of candidate fix commits is obtained from the source code repository of the affected project by filtering out commits that are known to be irrelevant for the task at hand. Finally, for each such candidate commit, our method builds a numerical feature vector reflecting the characteristics of the commit that are relevant to predicting its match with the advisory at hand. The feature vectors are then exploited for building a final ranked list of candidate fixing commits. The score attributed by the ML model to each feature is kept visible to the users, allowing them to interpret the predictions. We evaluated our approach using a prototype implementation named FixFinder on a manually curated data set that comprises 2,391 known fix commits corresponding to 1,248 public vulnerability advisories. When considering the top-10 commits in the ranked results, our implementation could successfully identify at least one fix commit for up to 84.03% of the vulnerabilities (with a fix commit on the first position for 65.06% of the vulnerabilities). In conclusion, our method reduces considerably the effort needed to search OSS repositories for the commits that fix known vulnerabilities.

6.4SEFeb 17, 2021Code

Automated Test-Case Generation for Solidity Smart Contracts: the AGSolT Approach and its Evaluation

Stefan Driessen, Dario Di Nucci, Geert Monsieur et al.

Blockchain and smart contract technology are novel approaches to data and code management that facilitate trusted computing by allowing for development in a distributed and decentralized manner. Testing smart contracts comes with its own set of challenges which have not yet been fully identified and explored. Although existing tools can identify and discover known vulnerabilities and their interactions on the Ethereum blockchain through random search or symbolic execution, these tools generally do not produce test suites suitable for human oracles. In this paper, we present AGSOLT (Automated Generator of Solidity Test Suites). We demonstrate its efficiency by implementing two search algorithms to automatically generate test suites for stand-alone Solidity smart contracts, taking into account some of the blockchain-specific challenges. To test AGSOLT, we compared a random search algorithm and a genetic algorithm on a set of 36 real-world smart contracts. We found that AGSOLT is capable of achieving high branch coverage with both approaches and even discovered some errors in some of the most popular Solidity smart contracts on Github.

30.5SEOct 7, 2020Code

Empirical Standards for Software Engineering Research

Paul Ralph, Nauman bin Ali, Sebastian Baltes et al.

Empirical Standards are natural-language models of a scientific community's expectations for a specific kind of study (e.g. a questionnaire survey). The ACM SIGSOFT Paper and Peer Review Quality Initiative generated empirical standards for research methods commonly used in software engineering. These living documents, which should be continuously revised to reflect evolving consensus around research best practices, will improve research quality and make peer review more effective, reliable, transparent and fair.

10.2CRJul 23, 2020

Blockchain and Cryptocurrencies: a Classification and Comparison of Architecture Drivers

Martin Garriga, Stefano Dalla Palma, Maximiliano Arias et al.

Blockchain is a decentralized transaction and data management solution, the technological leap behind the success of Bitcoin and other cryptocurrencies. As the variety of existing blockchains and distributed ledgers continues to increase, adopters should focus on selecting the solution that best fits their needs and the requirements of their decentralized applications, rather than developing yet another blockchain from scratch. In this paper we present a conceptual framework to aid software architects, developers, and decision makers to adopt the right blockchain technology. The framework exposes the interrelation between technological decisions and architectural features, capturing the knowledge from existing academic literature, industrial products, technical forums/blogs, and experts' feedback. We empirically show the applicability of our framework by dissecting the platforms behind Bitcoin and other top 10 cryptocurrencies, aided by a focus group with researchers and industry practitioners. Then, we leverage the framework together with key notions of the Architectural Tradeoff Analysis Method (ATAM) to analyze four real-world blockchain case studies from industry and academia. Results shown that applying our framework leads to a deeper understanding of the architectural tradeoffs, allowing to assess technologies more objectively and select the one that best fit developers needs, ultimately cutting costs, reducing time-to-market and accelerating return on investment.

7.3SEJul 4, 2020

Towards Semantic Detection of Smells in Cloud Infrastructure Code

Indika Kumara, Zoe Vasileiou, Georgios Meditskos et al.

Automated deployment and management of Cloud applications relies on descriptions of their deployment topologies, often referred to as Infrastructure Code. As the complexity of applications and their deployment models increases, developers inadvertently introduce software smells to such code specifications, for instance, violations of good coding practices, modular structure, and more. This paper presents a knowledge-driven approach enabling developers to identify the aforementioned smells in deployment descriptions. We detect smells with SPARQL-based rules over pattern-based OWL 2 knowledge graphs capturing deployment models. We show the feasibility of our approach with a prototype and three case studies.

3.0SEJun 22, 2020

Success and Failure in Software Engineering: a Followup Systematic Literature Review

Damian A. Tamburri, Fabio Palomba, Rick Kazman

Success and failure in software engineering are still among the least understood phenomena in the discipline. In a recent special journal issue on the topic, Mantyla et al. started discussing these topics from different angles; the authors focused their contributions on offering a general overview of both topics without deeper detail. Recognising the importance and impact of the topic, we have executed a followup, more in-depth systematic literature review with additional analyses beyond what was previously provided. These new analyses offer: (a) a grounded-theory of success and failure factors, harvesting over 500+ factors from the literature; (b) 14 manually-validated clusters of factors that provide relevant areas for success- and failure-specific measurement and risk-analysis; (c) a quality model composed of previously unmeasured organizational structure quantities which are germane to software product, process, and community quality. We show that the topics of success and failure deserve further study as well as further automated tool support, e.g., monitoring tools and metrics able to track the factors and patterns emerging from our study. This paper provides managers with risks as well as a more fine-grained analysis of the parameters that can be appraised to anticipate the risks.

5.3SEMay 27, 2020

Towards a Catalogue of Software Quality Metrics for Infrastructure Code

Stefano Dalla Palma, Dario Di Nucci, Fabio Palomba et al.

Infrastructure-as-code (IaC) is a practice to implement continuous deployment by allowing management and provisioning of infrastructure through the definition of machine-readable files and automation around them, rather than physical hardware configuration or interactive configuration tools. On the one hand, although IaC represents an ever-increasing widely adopted practice nowadays, still little is known concerning how to best maintain, speedily evolve, and continuously improve the code behind the IaC practice in a measurable fashion. On the other hand, source code measurements are often computed and analyzed to evaluate the different quality aspects of the software developed. However, unlike general-purpose programming languages (GPLs), IaC scripts use domain-specific languages, and metrics used for GPLs may not be applicable for IaC scripts. This article proposes a catalogue consisting of 46 metrics to identify IaC properties focusing on Ansible, one of the most popular IaC language to date, and shows how they can be used to analyze IaC scripts.

1.2CYApr 16, 2020

Organisational Structure Patterns in Agile Teams: An Industrial Empirical Study

Damian A. Tamburri, Rick Kazman, Hamed Fahimi

Forming members of an organization into coherent groups or communities is an important issue in any large-scale software engineering endeavour, especially so in agile software development teams which rely heavily on self-organisation and organisational flexibility. To address this problem, many researchers and practitioners have advocated a strategy of mirroring system structure and organisational structure, to simplify communication and coordination of collaborative work. But what are the patterns of organisation found in practice in agile software communities and how effective are those patterns? We address these research questions using mixed-methods research in industry, that is, interview surveys, focus-groups, and delphi studies of agile teams. In our study of 30 agile software organisations we found that, out of 7 organisational structure patterns that recur across our dataset, a single organisational pattern occurs over 37% of the time. This pattern: (a) reflects young communities (1-12 months old); (b) disappears in established ones (13+ months); (c) reflects the highest number of architecture issues reported. Finally, we observe a negative correlation between a proposed organisational measure and architecture issues. These insights may serve to aid architects in designing not only their architectures but also their communities to best support their co-evolution.

3.0SEMar 25, 2020

Quality Assurance of Heterogeneous Applications: The SODALITE Approach

Indika Kumara, Giovanni Quattrocchi, Damian Tamburri et al.

A key focus of the SODALITE project is to assure the quality and performance of the deployments of applications over heterogeneous Cloud and HPC environments. It offers a set of tools to detect and correct errors, smells, and bugs in the deployment models and their provisioning workflows, and a framework to monitor and refactor deployment model instances at runtime. This paper presents objectives, designs, early results of the quality assurance framework and the refactoring framework.

13.0CRNov 26, 2019

Blockchains: a Systematic Multivocal Literature Review

Bert-Jan Butijn, Damian A. Tamburri, Willem-Jan Van Den Heuvel

Blockchain technology has gained tremendous popularity both in practice and academia. The goal of this article is to develop a coherent overview of the state of the art in blockchain technology, using a systematic(i.e.,protocol-based, replicable), multivocal (i.e., featuring both white and grey literature alike) literature review, to (1) define blockchain technology (2) elaborate on its architecture options and (3) trade-offs, as well as understanding (4) the current applications and challenges, as evident from the state of the art. We derive a systematic definition of blockchain technology, based on a formal concept analysis. Further on, we flesh out an overview of blockchain technology elaborated by means of Grounded-Theory.