Fabio Petrillo

SE
37 papers
545 citations
Novelty 21%
AI Score 36

37 Papers

DC Sep 25, 2023
SPIRT: A Fault-Tolerant and Reliable Peer-to-Peer Serverless ML Training Architecture

Amine Barrak, Mayssa Jaziri, Ranim Trabelsi et al.

The advent of serverless computing has ushered in notable advancements in distributed machine learning, particularly within parameter server-based architectures. Yet, the integration of serverless features within peer-to-peer (P2P) distributed networks remains largely uncharted. In this paper, we introduce SPIRT, a fault-tolerant, reliable, and secure serverless P2P ML training architecture designed to bridge this gap. Capitalizing on the robustness and reliability inherent to P2P systems, SPIRT employs RedisAI for in-database operations, leading to an 82% reduction in the time required for model updates and gradient averaging across a variety of models and batch sizes. This architecture showcases resilience against peer failures and adeptly manages the integration of new peers, thereby highlighting its fault-tolerant characteristics and scalability. Furthermore, SPIRT ensures secure communication between peers, enhancing the reliability of distributed machine learning tasks. Even in the face of Byzantine attacks, the system's robust aggregation algorithms maintain high levels of accuracy. These findings illuminate the promising potential of serverless architectures in P2P distributed machine learning, offering a significant stride towards the development of more efficient, scalable, and resilient applications.

DC Sep 25, 2023
Exploring the Impact of Serverless Computing on Peer To Peer Training Machine Learning

Amine Barrak, Ranim Trabelsi, Fehmi Jaafar et al.

The increasing demand for computational power in big data and machine learning has driven the development of distributed training methodologies. Among these, peer-to-peer (P2P) networks provide advantages such as enhanced scalability and fault tolerance. However, they also encounter challenges related to resource consumption, costs, and communication overhead as the number of participating peers grows. In this paper, we introduce a novel architecture that combines serverless computing with P2P networks for distributed training and present a method for efficient parallel gradient computation under resource constraints. Our findings show a significant enhancement in gradient computation time, with up to a 97.34% improvement compared to conventional P2P distributed training methods. As for costs, our examination confirmed that the serverless architecture could incur higher expenses, reaching up to 5.4 times more than instance-based architectures. It is essential to consider that these higher costs are associated with marked improvements in computation time, particularly under resource-constrained scenarios. Despite the cost-time trade-off, the serverless approach still holds promise due to its pay-as-you-go model. Utilizing dynamic resource allocation, it enables faster training times and optimized resource utilization, making it a promising candidate for a wide range of machine learning applications.

DC Feb 27, 2023
Architecting Peer-to-Peer Serverless Distributed Machine Learning Training for Improved Fault Tolerance

Amine Barrak, Fabio Petrillo, Fehmi Jaafar

Distributed Machine Learning refers to the practice of training a model on multiple computers or devices that can be called nodes. Additionally, serverless computing is a new paradigm for cloud computing that uses functions as a computational unit. Serverless computing can be effective for distributed learning systems by enabling automated resource scaling, less manual intervention, and cost reduction. By distributing the workload, distributed machine learning can speed up the training process and allow more complex models to be trained. Several topologies of distributed machine learning have been established (centralized, parameter server, peer-to-peer). However, the parameter server architecture may have limitations in terms of fault tolerance, including a single point of failure and complex recovery processes. Moreover, training machine learning in a peer-to-peer (P2P) architecture can offer benefits in terms of fault tolerance by eliminating the single point of failure. In a P2P architecture, each node or worker can act as both a server and a client, which allows for more decentralized decision making and eliminates the need for a central coordinator. In this position paper, we propose exploring the use of serverless computing in distributed machine learning training and comparing the performance of P2P architecture with the parameter server architecture, focusing on cost reduction and fault tolerance.

SE Aug 11, 2020 Code
Open Source Software Development Process: A Systematic Review

Bianca Minetto Napoleão, Fabio Petrillo, Sylvain Hallé

Open Source Software (OSS) has been recognized by the software development community as an effective way to deliver software. Unlike traditional software development, OSS development is driven by collaboration among geographically distributed developers motivated by common goals and interests. Moreover, the OSS community recognizes the need to understand the OSS development process and its activities. Our goal is to investigate the state of the art of the OSS process by conducting a systematic literature review that provides an overview of how the OSS community has investigated the OSS process over the past years, identifies and summarizes OSS process activities and their characteristics, and translates the OSS process into a macro process using BPMN notation. As a result, we systematically analysed 33 studies and present an overview of the state of the art of research on the OSS process, together with a generalized OSS development macro process represented in BPMN notation, with a detailed description of each OSS process activity and of the roles in the OSS environment. We conclude that the OSS process can be further investigated in practice by researchers. In addition, the presented OSS process can be used as a guide for OSS projects and adapted to the reality of each project. It provides insights to managers and developers who want to improve their development process in both OSS and traditional environments. Finally, recommendations for the OSS community regarding OSS process activities are provided.

SE Aug 8, 2020 Code
DR-Tools: a suite of lightweight open-source tools to measure and visualize Java source code

Guilherme Lacerda, Fabio Petrillo, Marcelo Pimenta

In Software Engineering, some of the most critical activities are maintenance and evolution. However, to perform both with quality while minimizing impacts and risks, developers first need to analyze and identify where the main problems come from. In this paper, we introduce DR-Tools Suite, a set of lightweight open-source tools that analyze and calculate source code metrics, allowing developers to visualize the results in different formats and graphs. Also, we define a set of heuristics to help the code analysis. We conducted two case studies (one academic and one industrial) to collect feedback on the tools suite, on how we will evolve the tools, as well as insights to develop new tools that support developers in their daily work.

SE Apr 12, 2020 Code
Are Game Engines Software Frameworks? A Three-perspective Study

Cristiano Politowski, Fabio Petrillo, João Eduardo Montandon et al.

Game engines help developers create video games and avoid duplication of code and effort, like frameworks for traditional software systems. In this paper, we explore open-source game engines along three perspectives: literature, code, and human. First, we explore and summarise the academic literature on game engines. Second, we compare the characteristics of the 282 most popular engines and the 282 most popular frameworks in GitHub. Finally, we survey 124 engine developers about their experience with the development of their engines. We report that: (1) Game engines are not well-studied in software-engineering research, with few studies having engines as the object of research. (2) Open-source game engines are slightly larger in terms of size and complexity and less popular and engaging than traditional frameworks. Their programming languages differ greatly from frameworks. Engine projects have shorter histories with fewer releases. (3) Developers perceive game engines as different from traditional frameworks. Generally, they build game engines to (a) better control the environment and source code, (b) learn about game engines, and (c) develop specific games. We conclude that open-source game engines differ from traditional open-source frameworks, although these differences do not demand special treatment.

SE Jan 25, 2019 Code
Software Architecture Metrics: a literature review

Théo Coulin, Maxence Detante, William Mouchère et al.

In Software Engineering, early detection of architectural issues is key. It helps mitigate the risk of poor performance, and lowers the cost of repairing these issues. Metrics give a quick overview of the project which helps designers with the detection of flaws or degradation in their architecture. Even though studies unveiled architectural metrics more than 25 years ago, they have not yet been embraced by the industry nor the open source community. In this study, we aim at conducting a review of existing metrics focused on the software architecture for evaluating quality, early in the design flow and throughout the project's lifetime. We also give guidelines of their usage and study their relevance in different contexts.

14.6 SE Apr 7
Proof of Concept as a First-Class Architectural Decision Instrument

Bruno Fernando Antognolli, Fabio Petrillo

Proofs of Concept (PoCs) are widely adopted practices in software engineering. Despite their relevance, PoCs remain conceptually underdefined and methodologically ad hoc in both research and industry, with definitions and implementation approaches that often lack clarity and consistency. This paper investigates the concept of PoCs with two complementary goals: (1) to provide a refined definition and a structured framework for PoC development grounded in a systematic review of academic and grey literature; and (2) to position PoCs as first-class architectural decision instruments rather than informal experiments or disposable artifacts. Through a systematic review of academic and grey literature, we identify the key characteristics and processes associated with PoCs and expose a significant gap: the academic literature describes PoC outcomes but rarely the PoC process. By synthesizing insights from diverse sources, we propose a refined definition and a lightweight, three-phase framework (planning, execution, decision-making) that encompasses technical validation and explicit decision traceability. We also introduce the Undocumented Architectural Experiment anti-pattern, arising when PoCs influence high-impact architectural decisions without leaving durable architectural knowledge. We argue that elevating PoCs to first-class status improves decision quality, enhances traceability, and supports more systematic learning in architectural practice.

SE Feb 25, 2022
Towards Automated Video Game Testing: Still a Long Way to Go

Cristiano Politowski, Yann-Gaël Guéhéneuc, Fabio Petrillo

As the complexity and scope of game development increase, playtesting remains an essential activity to ensure the quality of video games. Yet, the manual, ad-hoc nature of playtesting leaves room for improvement in the process. In this study, we investigate gaps between academic solutions in the literature for automated video game testing and the needs of video game developers in the industry. We performed a literature review on automated video game testing and conducted an online survey with video game developers. The literature results show a rise in research topics related to automated video game testing. The survey results show that game developers are skeptical about using automated agents to test games. We conclude that there is a need for new testing approaches that do not disrupt the developer workflow. As for the researchers, the focus should be on the testing goal and testing oracle.

SE Feb 13, 2022
Video Game Project Management Anti-patterns

Gabriel C. Ullmann, Cristiano Politowski, Yann-Gaël Guéhéneuc et al.

Project Management anti-patterns are well-documented in the software-engineering literature, and studying them allows understanding their impacts on teams and projects. The video game development industry is known for its mismanagement practices, and therefore applying this knowledge would help improve game developers' productivity and well-being. In this paper, we map project management anti-patterns to anti-patterns reported by game developers in the gray literature. We read 440 postmortem problems, identified anti-pattern candidates, and related them to definitions from the software-engineering literature. We discovered that most anti-pattern candidates could be mapped to anti-patterns in the software-engineering literature, except for Feature Creep, Feature Cuts, Working on Multiple Projects, and Absent or Inadequate Tools. We discussed the impact of the unmapped candidates on the development process while also drawing a parallel between video games and traditional software development. Future work includes validating the definitions of the candidates via a survey with practitioners and also considering development anti-patterns.

SE Dec 22, 2021
Log severity level classification: an approach for systems in production

Eduardo Mendes, Fabio Petrillo

Context: Logs are often the primary source of information for system developers and operations engineers to understand and diagnose the behavior of a software system in production. In many cases, logs are the only evidence available for fault investigation. Problem: However, the inappropriate choice of log severity level can impact the amount of log data generated and, consequently, quality. This storage overhead can impact the performance of log-based monitoring systems, as excess log data comes with increased aggregate noise, making it challenging to utilize what is actually important when trying to do diagnostics. Goal: This research aims to decrease the overheads of monitoring systems by processing the severity level of log data from systems in production. Approach: To achieve this goal, we intend to deepen the knowledge about the log severity levels and develop an automated approach to log severity level classification, demonstrating that reducing log severity level "noise" improves the monitoring of systems in production. Conclusion: We hope that the set of contributions from this work can improve the monitoring activities of software systems and contribute to the creation of knowledge that improves logging practices.

SE Sep 2, 2021
Log severity levels matter: A multivocal mapping

Eduardo Mendes, Fabio Petrillo

The choice of log severity level can be challenging and cause problems in producing reliable logging data. However, there is a lack of specifications and practical guidelines to support this challenge. In this study, we present a multivocal systematic mapping of log severity levels from peer-reviewed literature, logging libraries, and practitioners' views. We analyzed 19 severity levels, 27 studies, and 40 logging libraries. Our results show redundancy and semantic similarity between the levels and a tendency to converge the levels for a total of six levels. Our contributions help leverage the reliability of log entries: (i) mapping the literature about log severity levels, (ii) mapping the severity levels in logging libraries, (iii) a set of synthesized six definitions and four general purposes for severity levels. We recommend that developers use a standard nomenclature, and for logging library creators, we suggest providing accurate and unambiguous definitions of log severity levels.

SE Aug 31, 2021
Mapping breakpoint types: an exploratory study

Eduardo Andreetta Fontana, Fabio Petrillo

Debugging is a relevant task for finding bugs during software development, maintenance, and evolution. During debugging, developers use modern IDE debuggers to analyze variables, step execution, and set breakpoints. Observing IDE debuggers, we find several breakpoint types. However, what are the breakpoint types? The goal of our study is to map the breakpoint types among IDEs and academic literature. Thus, we mapped the gray literature on the documentation of the nine main IDEs used by developers according to three public rankings. In addition, we performed a systematic mapping of academic literature over 68 articles describing breakpoint types. Finally, we analyzed the developers' understanding of the main breakpoint types through a questionnaire. We present three main contributions: (1) the mapping of breakpoint types (IDEs and literature), (2) compiled definitions of breakpoint types, (3) a breakpoint type taxonomy. Our contributions provide the first step to organize breakpoint IDE taxonomy and lexicon, and support further debugging research.

SE Aug 29, 2021
Continuous Systematic Literature Review: An Approach for Open Science

Bianca Minetto Napoleão, Fabio Petrillo, Sylvain Hallé

Systematic Literature Reviews (SLRs) play an important role in the Evidence-Based Software Engineering scenario. With the advance of the computer science field and the growth of research publications, new evidence continuously arises. This makes it difficult to keep SLRs up to date, and an outdated SLR could lead researchers to obsolete conclusions or decisions about a research problem or investigation. Creating SLRs and keeping them up to date demands significant effort for several reasons, such as the rapid increase in the amount of evidence, the limitations of available databases, and the lack of detailed protocol documentation and data availability. Conventionally, in software engineering, SLRs are not updated or are updated only intermittently, leaving gaps between updates during which the SLR may be missing important new research. To address these issues, we propose the concept, process and tooling support of Continuous Systematic Literature Review (CSLR) in SE, aiming to keep SLRs constantly updated while promoting open science practices. This position paper summarizes our proposal and approach under development.

SE Jun 25, 2021
Towards auto-completion on software requirements statements

Carlos Alberto dos Santos, Fabio Petrillo

As software systems become more complex, modern software development requires more attention to human perspectives, and active participation of development teams in requirements elicitation tasks. In this context, incomplete or ambiguous requirements descriptions do not guide the development of good software products. We hypothesize that the text auto-completion feature improves the quality of the software requirements artifacts. We present the motivation for this study, related works, our approach and future research efforts.

SE Jun 6, 2021
Towards Logging Noisiness Theory: quality aspects to characterize unwanted log entries

Eduardo Mendes, Fabio Petrillo

Context: Logging tasks track the system's functioning by keeping records of evidence that have been analyzed by monitoring and observability activities. For these activities to be effective, it is necessary to consider the quality of the consumed information. Problem: However, the presence of noise - unwanted information - compromises the log files' quality. The noisiness of a log file can be affected among other things by: (i) the wrong severity log choices, (ii) the production of duplicate entries, (iii) the incompleteness of the information, (iv) the inappropriate format of the entries, (v) the amount of information generated. Objective: This work aims to broadly define the concept of noise in the context of logging, proposing the initial steps of Logging Noisiness, a theory on quality aspects to characterize unwanted log entries.

SE May 28, 2021
What Makes a Game High-rated? Towards Factors of Video Game Success

Gabriel Ullmann, Cristiano Politowski, Yann-Gaël Guéhéneuc et al.

As the video game market grows larger, it becomes harder to stand out from the crowd. Launching a successful game involves different aspects. But what are they? In this paper, we investigate some aspects of the high-rated games from a dataset of 200 projects. The results show that none of the aspects in this study has a strong relationship with the game's success. A further analysis of the high-rated games shows that team, technical, and game-design aspects should be the main focus of game developers.

RO Apr 22, 2021
Towards Automated Acceptance testing for industrial robots

Marcela G. dos Santos, Fabio Petrillo

Industrial robots are important machines applied in numerous modern industries that execute repetitive tasks with high accuracy, replacing or supporting dangerous jobs. In this kind of system, with increased complexity and in which cost is related to the time the system keeps working, the system must operate with a minimum number of failures. In other words, an important quality aspect in industry is reliability. We hypothesize that Automated Acceptance Testing improves the reliability of industrial robot programs. We present the research question, the motivation for this study, our hypothesis and future research efforts.

SE Mar 25, 2021
Towards improving architectural diagram consistency using system descriptors

Jalves Nicacio, Fabio Petrillo

Communication between practitioners is essential for the system's quality in the DevOps context. To improve this communication, practitioners often use informal diagrams to represent the components of a system. However, as systems evolve, it is a challenge to synchronize diagrams with production environments consistently. Hence, the inconsistency of architectural diagrams can affect communication between practitioners and their understanding of systems. In this paper, we propose the use of system descriptors to improve deployment diagram consistency. We state two main hypotheses: (1) if an architectural diagram is generated from a valid system descriptor, then the diagram is consistent; (2) if a valid system descriptor is generated from an architectural diagram, then the diagram is consistent. We report a case study to explore our hypotheses. Furthermore, we constructed a system descriptor from the Netflix deployment diagram, and we applied our tool to generate a new architectural diagram. Finally, we compare the original and generated diagrams to evaluate our proposal. Our case study shows that all Docker Compose description elements can be graphically represented in the generated architectural diagram, and the generated diagram does not present inconsistent aspects of the original diagram. Thus, our preliminary results lead to further evaluation in controlled and empirical experiments to test our hypotheses.

SE Mar 11, 2021
A Survey of Video Game Testing

Cristiano Politowski, Fabio Petrillo, Yann-Gaël Guéhéneuc

Video-game projects are notorious for having day-one bugs, no matter how big their budget or team size. The quality of a game is essential for its success. This quality could be assessed and ensured through testing. However, to the best of our knowledge, little is known about video-game testing. In this paper, we want to understand how game developers perform game testing. We investigate, through a survey, the academic and gray literature to identify and report on existing testing processes and how they could automate them. We found that game developers rely, almost exclusively, upon manual play-testing and the testers' intrinsic knowledge. We conclude that current testing processes fall short because of their lack of automation, which seems to be the natural next step to improve the quality of games while maintaining costs. However, the current game-testing techniques may not generalize to different types of games.

RO Feb 24, 2021
Software Engineering for Robotic Systems: a systematic mapping study

Marcela G. dos Santos, Fabio Petrillo

Robots are being applied in a vast range of fields, leading researchers and practitioners to write tasks more complex than in the past. The robot software complexity increases the difficulty of engineering the robot's software components with quality requirements. Researchers and practitioners have applied software engineering (SE) approaches and robotic domains to address this issue in the last two decades. This study aims to identify, classify and evaluate the current state of the art in Software Engineering for Robotic Systems (SERS). We systematically selected and analyzed 50 primary studies extracted from an automated search on the Scopus digital library and a manual search on the two editions of the RoSE workshop. We present three main contributions. Firstly, we provide an analysis from the following three perspectives: demographics of publication, SE areas applied in robotics domains, and RSE findings. Secondly, we show a catalogue of research studies that apply software engineering techniques in the robotic domain, classified with the SWEBOK guide. We have identified 5 of 15 software engineering areas from the SWEBOK guide applied explicitly in robotic domains. The majority of the studies focused on the development phase (design, models and methods, and construction). The software testing and software quality areas have little coverage in SERS. Finally, we identify research opportunities and gaps in software engineering for robotic systems for future studies.

SE Nov 4, 2020
What Skills do IT Companies look for in New Developers? A Study with Stack Overflow Jobs

João Eduardo Montandon, Cristiano Politowski, Luciana Lourdes Silva et al.

Context: There is a growing demand for information on how IT companies look for candidates for their open positions. Objective: This paper investigates which hard and soft skills are most required by IT companies by analyzing the descriptions of 20,000 job opportunities. Method: We applied open card sorting to perform a high-level analysis of which types of hard skills are most requested. Further, we manually analyzed the most mentioned soft skills. Results: Programming languages are the most demanded hard skills. Communication, collaboration, and problem-solving are the most demanded soft skills. Conclusion: We recommend that developers organize their résumé according to the positions they are applying for. We also highlight the importance of soft skills, as they appear in many job opportunities.

SE Sep 5, 2020
Are the Old Days Gone? A Survey on Actual Software Engineering Processes in Video Game Industry

Cristiano Politowski, Lisandra Fontoura, Fabio Petrillo et al.

In the past 10 years, several researchers have studied the video game development process and proposed approaches to improve the way games are developed. These approaches usually adopt agile methodologies because of claims that traditional practices and the waterfall process are gone. However, are the "old days" really gone in the game industry? In this paper, we present a survey of software engineering processes in the video game industry from postmortem project analyses. We analyzed 20 postmortems from Gamasutra Portal. We extracted their processes and modelled them using the Business Process Model and Notation (BPMN). This work presents three main contributions. First, a postmortem analysis methodology to identify and extract project processes. Second, the study's main result: the "old days" are gone, but not completely. Iterative practices are increasing and are applied to at least 65% of projects, of which 45% explicitly adopted Agile practices. However, the waterfall process is still applied in at least 30% of projects. Finally, we discuss some implications, directions and opportunities for the video game development community.

SE Sep 5, 2020
Learning from the past: A process recommendation system for video game projects using postmortems experiences

Cristiano Politowski, Lisandra M. Fontoura, Fabio Petrillo et al.

Context: The video game industry is a billion dollar industry that faces problems in the way games are developed. One method to address these problems is using developer aid tools, such as Recommendation Systems. These tools assist developers by generating recommendations to help them perform their tasks. Objective: This article describes a systematic approach to recommend development processes for video game projects, using postmortem knowledge extraction and a model of the context of the new project, in which "postmortems" are articles written by video game developers at the end of projects, summarizing the experience of their game development team. This approach aims to provide reflections about development processes used in the game industry as well as guidance to developers to choose the most adequate process according to the contexts they're in. Method: Our approach is divided into three separate phases: in the first phase, we manually extracted the processes from the postmortems analysis; in the second one, we created a video game context and algorithm rules for recommendation; and finally in the third phase, we evaluated the recommended processes by using quantitative and qualitative metrics, game developers' feedback, and a case study by interviewing a video game development team. Contributions: This article brings three main contributions. The first describes a database of developers' experiences extracted from postmortems in the form of development processes. The second defines the main attributes that a video game project contains, which it uses to define the contexts of the project. The third describes and evaluates a recommendation system for video game projects, which uses the contexts of the projects to identify similar projects and suggest a set of activities in the form of a process.

SE Sep 5, 2020
Game Industry Problems: an Extensive Analysis of the Gray Literature

Cristiano Politowski, Fabio Petrillo, Gabriel C. Ullmann et al.

Context: Given its competitiveness, the video-game industry has a closed-source culture. Hence, little is known of the problems faced by game developers. However, game developers do share information about their game projects through postmortems, which describe informally what happened during the projects. Objective: The software-engineering research community and game developers would benefit from a state of the problems of the video game industry, in particular the problems faced by game developers, their evolution in time, and their root causes. This state of the practice would allow researchers and practitioners to work towards solving these problems. Method: We analyzed 200 postmortems from 1997 to 2019, resulting in 927 problems divided into 20 types. Through our analysis, we described the overall landscape of game industry problems in the past 23 years and how these problems evolved over the years. We also give details on the most common problems, their root causes, and possible solutions. We finally discuss suggestions for future projects. Results: We observe that (1) the game industry suffers from management and production problems in the same proportion; (2) management problems decreased over the years giving space to business problems, while production problems remained constant; (3a) technical and game design problems are decreasing over the years, the latter only after the last decade; (3b) problems related to the team increased over the last decade; (3c) marketing problems are the ones that had the biggest increase over the 23 years compared to other problem types; (4) finally, the majority of the main root causes are related to people, not technologies. Conclusions: In this paper we provide a state of the practice for researchers to understand and study video-game development problems. We also offer suggestions to help practitioners avoid the most common problems.

SESep 5, 2020
A Large Scale Empirical Study of the Impact of Spaghetti Code and Blob Anti-patterns on Program Comprehension

Cristiano Politowski, Foutse Khomh, Simone Romano et al.

Context: Several studies investigated the impact of anti-patterns (i.e., "poor" solutions to recurring design problems) during maintenance activities and reported that anti-patterns significantly affect the developers' effort required to edit files. However, before developers edit files, they must understand the source code of the systems. This source code must be easy to understand by developers. Objective: In this work, we provide a complete assessment of the impact of two instances of two anti-patterns, Blob or Spaghetti Code, on program comprehension. Method: We analyze the impact of these two anti-patterns through three empirical studies conducted at Polytechnique Montréal (Canada) with 24 participants; at Carleton University (Canada) with 30 participants; and at the University of Basilicata (Italy) with 79 participants. Results: We collect data from 372 tasks performed by 133 different participants from the three universities. We use three metrics to assess the developers' comprehension of the source code: (1) the duration to complete each task; (2) their percentage of correct answers; and (3) the NASA task load index for their effort. Conclusions: We report that, although single occurrences of Blob or Spaghetti Code anti-patterns have little effect on code comprehension, two occurrences of either Blob or Spaghetti Code significantly increase the developers' time spent on their tasks, reduce their percentage of correct answers, and increase their effort. Hence, we recommend that developers act on both anti-patterns, which should be refactored out of the source code whenever possible. We also recommend further studies on combinations of anti-patterns rather than on single anti-patterns one at a time.

SEAug 25, 2020
Applying system descriptors to address ambiguity on deployment diagrams

Jalves Nicacio, Fabio Petrillo

Communication between practitioners is essential for product quality in the DevOps context. This communication often takes place through deployment diagrams of a system under development. However, it is common for diagrams to become ambiguous or inconsistent as the system progresses and moves into a continuous delivery pipeline or production. Moreover, diagrams may not follow the evolution of systems, and it is challenging to associate diagrams with production. In this paper, we propose the use of system descriptors to address the ambiguity of deployment diagrams. We state three main hypotheses: (1) if a deployment diagram is generated from a valid system descriptor, then the diagram is unambiguous; (2) if a valid system descriptor is generated from a deployment diagram, then the descriptor is unambiguous; (3) if a diagram $μ$ generated from a descriptor $A$ is unambiguous and a descriptor $B$ generated from the diagram $μ$ is equally unambiguous, then descriptors $A$ and $B$ are equivalent. We report a case study to test our hypotheses. We constructed a system descriptor from a Netflix deployment diagram, and we applied our tool to generate a new deployment diagram. Finally, we compare the original and generated diagrams to evaluate our proposal. Our case study shows that the generated deployment diagrams are graphically equivalent to system descriptors and eliminate ambiguous aspects of the original diagram. Thus, our preliminary results call for further evaluation in controlled and empirical experiments to test our hypotheses conclusively.

SEApr 27, 2020
Internet of Things Architectures: A Comparative Study

Marcela G. dos Santos, Darine Ameyed, Fabio Petrillo et al.

Over the past two decades, the Internet of Things (IoT) has become an underlying concept for such a variety of solutions and technologies that it is now hardly possible to enumerate and describe all of them. The concept behind the Internet of Things is as powerful as it is complex, and for the components in an IoT solution to mesh together perfectly, they all have to be part of a well-thought-out structure. That is where understanding the IoT architecture becomes paramount. Because of the vast domain of IoT, there is no single consensus on IoT architecture. Different researchers and organizations have proposed different architectures under a variety of classifications, mainly: conceptual, standard, and industrial or commercial adoption. A systematic analysis of IoT architecture is indispensable to compare the industrial proposals and identify their similarities and differences. In this work, we summarize information about seven IoT industrial architectures in order to propose an approach that makes a comparative analysis between different IoT architectures possible. This work presents two main contributions: (i) an approach for analyzing and comparing IoT architectures using Layer-Model; (ii) a comparative study of seven industrial IoT architectures.

SEApr 22, 2020
Code Smells and Refactoring: A Tertiary Systematic Review of Challenges and Observations

Guilherme Lacerda, Fabio Petrillo, Marcelo Pimenta et al.

In this paper, we present a tertiary systematic literature review of previous surveys, secondary systematic literature reviews, and systematic mappings. We identify the main observations (what we know) and challenges (what we do not know) on code smells and refactoring. We show that code smells and refactoring have a strong relationship with quality attributes, i.e., with understandability, maintainability, testability, complexity, functionality, and reusability. We argue that code smells and refactoring could be considered two faces of the same coin. In addition, we identify how refactoring affects quality attributes, more than code smells. We also discuss the implications of this work for practitioners, researchers, and instructors. We identify 13 open issues that could guide future research work. Thus, we want to highlight the gap between code smells and refactoring in the current state of software-engineering research. We hope that this work helps the software-engineering research community collaborate on future work on code smells and refactoring.

SEApr 3, 2020
A Tertiary and Secondary Study Canvas

Bianca Minetto Napoleão, Fabio Petrillo, Sylvain Hallé

Over the past years, more secondary (Systematic Literature Reviews and Systematic Mappings) and tertiary studies have been conducted. Conducting them is considered a large and labor-intensive task, since it involves a detailed process that includes protocol development, one of the most challenging phases reported by the software engineering research community. In this scenario, we propose a Secondary and Tertiary Study Canvas that aims to simplify and clarify the steps to be performed when conducting secondary and tertiary studies, including protocol development. To this end, we synthesized and organized the existing secondary studies' protocols in a Canvas format and suggest a step-based approach to assist in conducting secondary and tertiary studies.

SEMar 1, 2020
The cross cyclomatic complexity: a bi-dimensional measure for program complexity on graphs

Hugo Tremblay, Fabio Petrillo

Reducing and controlling complexity is an essential practice in software design. Cyclomatic complexity (CC) is one of the most popular software metrics, applied for more than 40 years. Although CC is an interesting metric to highlight the number of branches in a program, it is clearly not sufficient to represent the complexity of a piece of software. In this paper, we introduce the cross cyclomatic complexity (CCC), a new bi-dimensional complexity measure on graphs that combines the cyclomatic complexity and the weight of a minimum-weight cycle basis as a pair on the Cartesian plane to characterize program complexity using control flow graphs. Our postulates open a new avenue to represent program complexity, and we discuss its implications and opportunities.
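As a rough illustration of the first of the two dimensions mentioned in this abstract, the sketch below computes the classic McCabe cyclomatic complexity, CC = E - N + 2P, from a control flow graph given as an edge list. The function name and the toy graph are hypothetical, and the second dimension (the weight of a minimum-weight cycle basis) is deliberately not computed here.

```python
from collections import defaultdict


def cyclomatic_complexity(edges):
    """Return McCabe's CC = E - N + 2P for a graph given as (src, dst) edges.

    E is the number of edges, N the number of nodes, and P the number of
    connected components (edges are treated as undirected to find P).
    """
    nodes = set()
    adjacency = defaultdict(set)
    for src, dst in edges:
        nodes.update((src, dst))
        adjacency[src].add(dst)
        adjacency[dst].add(src)

    # Count connected components with an iterative depth-first search.
    seen = set()
    components = 0
    for start in nodes:
        if start in seen:
            continue
        components += 1
        stack = [start]
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            stack.extend(adjacency[node] - seen)

    return len(edges) - len(nodes) + 2 * components


# Hypothetical control flow graph of a single if/else statement.
IF_ELSE_CFG = [
    ("entry", "cond"),
    ("cond", "then"),
    ("cond", "else"),
    ("then", "exit"),
    ("else", "exit"),
]

print(cyclomatic_complexity(IF_ELSE_CFG))  # 5 edges - 5 nodes + 2 * 1 = 2
```

A straight-line program with no branches yields 1, and each additional decision point adds 1, which is why CC alone is often read as a branch count.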

SEJan 2, 2020
Dataset of Video Game Development Problems

Cristiano Politowski, Fabio Petrillo, Gabriel Cavalheiro Ullmann et al.

Unlike traditional software development, there is little information about the software-engineering process and techniques in video-game development. One popular way to share knowledge among the video-game developers' community is the publishing of postmortems, which are documents summarizing what happened during the video-game development project. However, these documents are written without formal structure and often provide disparate information. Through this paper, we provide developers and researchers with a grounded dataset describing software-engineering problems in video-game development extracted from postmortems. We created the dataset using an iterative method through which we manually coded more than 200 postmortems spanning 20 years (1998 to 2018) and extracted 1,035 problems related to software engineering while maintaining traceability links to the postmortems. We grouped the problems into 20 different types. This dataset is useful to understand the problems faced by developers during video-game development, providing researchers and practitioners with a starting point to study video-game development in the context of software engineering.

SEDec 18, 2019
Establishing a Search String to Detect Secondary Studies in Software Engineering

Bianca Minetto Napoleao, Katia Romero Felizardo, Erica Ferreira de Souza et al.

Context: A tertiary study can be performed to identify related reviews on a topic of interest. However, the elaboration of an appropriate and effective search string to detect secondary studies is challenging for Software Engineering (SE) researchers. Objective: The main goal of this study is to propose a suitable search string to detect secondary studies in SE, addressing issues such as the quantity of applied terms, relevance, recall and precision. Method: We analyzed seven tertiary studies under two perspectives: (1) structure -- the strings' terms used to detect secondary studies; and (2) field -- where to search: titles alone, abstracts alone, or titles and abstracts together, among others. We validated our string through a two-step process. First, we evaluated its capability to retrieve secondary studies over a set of 1537 secondary studies included in 24 tertiary studies in SE. Second, we evaluated its general capacity to retrieve secondary studies through an automated search using the Scopus digital library. Results: Our string retrieved over 90% of the included secondary studies (recall), with a high general precision of almost 60%. Conclusion: The suitable search string for finding secondary studies in SE contains the terms "systematic review", "literature review", "systematic mapping", "mapping study" and "systematic map".
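To make the recall and precision figures in this abstract concrete, here is a minimal sketch that joins the five reported terms into a generic OR query and scores it on a toy corpus. The titles, labels, and helper names are hypothetical illustrations, not data from the study, and the OR-joined string is a generic form rather than the exact syntax of any digital library.

```python
# The five terms reported in the abstract's conclusion.
TERMS = [
    "systematic review",
    "literature review",
    "systematic mapping",
    "mapping study",
    "systematic map",
]


def build_search_string(terms):
    """Join quoted terms with OR (generic form, not a specific library syntax)."""
    return " OR ".join(f'"{term}"' for term in terms)


def matches(text, terms=TERMS):
    """True if any term occurs in the text, ignoring case."""
    lowered = text.lower()
    return any(term in lowered for term in terms)


def recall_precision(retrieved, relevant):
    """Recall = hits / relevant, precision = hits / retrieved."""
    hits = len(retrieved & relevant)
    recall = hits / len(relevant) if relevant else 0.0
    precision = hits / len(retrieved) if retrieved else 0.0
    return recall, precision


# Hypothetical titles; RELEVANT marks those that really are secondary studies.
CORPUS = {
    1: "A systematic review of code smells",
    2: "Mapping study on DevOps practices",
    3: "A survey of debugging tools",
    4: "Testing in practice: a case study with a short literature review",
    5: "An empirical study of anti-patterns",
}
RELEVANT = {1, 2, 3}

retrieved = {doc_id for doc_id, title in CORPUS.items() if matches(title)}
print(build_search_string(TERMS))
print(recall_precision(retrieved, RELEVANT))
```

In this toy corpus the survey (3) is missed, which lowers recall, and the case study (4) is retrieved by mistake, which lowers precision.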

SEFeb 10, 2019
Swarm Debugging: the Collective Intelligence on Interactive Debugging

Fabio Petrillo, Yann-Gaël Guéhéneuc, Marcelo Pimenta et al.

One of the most important tasks in software maintenance is debugging. To start an interactive debugging session, developers usually set breakpoints in an integrated development environment and navigate through different paths in their debuggers. We started our work by asking what debugging information is useful to share among developers and studied two pieces of information: breakpoints (and their locations) and sessions (debugging paths). To answer our question, we introduce the Swarm Debugging concept to frame the sharing of debugging information, the Swarm Debugging Infrastructure (SDI) with which practitioners and researchers can collect and share data about developers' interactive debugging sessions, and the Swarm Debugging Global View (GV) to display debugging paths. Using the SDI, we conducted a large study with professional developers to understand how developers set breakpoints. Using the GV, we also analyzed professional developers in two studies and collected data about their debugging sessions. Our observations and the answers to our research questions suggest that sharing and visualizing debugging data can support debugging activities.

SEJan 25, 2019
A quality model for evaluating and choosing a stream processing framework architecture

Youness Dendane, Fabio Petrillo, Hamid Mcheick et al.

Today, we have to deal with large volumes of data (Big Data), and we need to choose an architectural framework to analyze data coming from different areas. Processing these data becomes problematic, even more so when the data are continuous. To process data, you usually first receive it, store it, and then query it. This is what we call batch processing. It works well for large amounts of data, but it reaches its limits when you need fast (or real-time) results, as with financial trades, sensors, or user session activity. The solution to this problem is stream processing. In the stream processing approach, data arrive record by record and, rather than being stored, are processed directly. Results are therefore expected directly, in real time, with a latency that may vary. In this paper, we propose a quality model to evaluate and choose stream processing frameworks. We briefly describe different architectural frameworks, such as Kafka, Spark Streaming and Flink, that address stream processing. Using our quality model, we present a decision tree to help engineers choose a framework according to the quality aspects. Finally, we evaluate our model through a case study on Twitter and Netflix streaming.
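The contrast between batch and stream processing described in this abstract can be sketched in a few lines. This is a toy illustration with hypothetical function names, not code for Kafka, Spark Streaming, or Flink: the batch version stores every record before answering, while the streaming version emits an updated result as each record arrives.

```python
def batch_average(records):
    """Batch style: receive and store all records, then query them once."""
    stored = list(records)
    return sum(stored) / len(stored)


def streaming_average(records):
    """Stream style: update the result record by record, storing only a
    running count and total rather than the records themselves."""
    count = 0
    total = 0.0
    for value in records:
        count += 1
        total += value
        yield total / count


readings = [2, 4, 6]  # hypothetical sensor readings
print(batch_average(readings))            # one answer after all data is in
print(list(streaming_average(readings)))  # one answer per arriving record
```

The streaming version keeps constant state regardless of how many records arrive, which is the property that makes it suitable for unbounded inputs.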

SEJan 13, 2019
Serverless architecture efficiency: an exploratory study

Samuel Lavoie, Anthony Garant, Fabio Petrillo

Cloud service providers offer services to incentivize customers to use their platforms. Different services can achieve the same result at different costs. In this paper, we study the efficiency of a serverless architecture for running highly parallelizable tasks and compare these services in order to find the most efficient in terms of performance and cost. More precisely, we look at the compute time and the cost per task for a given task. The task studied is counting the occurrences of a given word in a corpus. We compare the serverless architecture to the Apache Spark map-reduce technique commonly used for this type of task. Using AWS Lambda for the serverless architecture and Amazon EMR for the Apache Spark map-reduce, with similar compute power, we show that the serverless technique achieves comparable performance in terms of compute time and cost. We observed that the Lambda function is a great approach for real-time computing, while EMR is preferable for tasks that require long compute times.
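The word-count task in this abstract is highly parallelizable because each chunk of the corpus can be counted independently and the partial counts summed. The sketch below shows that map and reduce structure on a single machine, using a local thread pool as a stand-in for parallel workers; it does not use AWS Lambda, Amazon EMR, or Apache Spark, and the function names and sample corpus are hypothetical.

```python
import re
from concurrent.futures import ThreadPoolExecutor


def count_in_chunk(chunk, word):
    """Map step: count whole-word, case-insensitive occurrences in one chunk."""
    pattern = rf"\b{re.escape(word.lower())}\b"
    return len(re.findall(pattern, chunk.lower()))


def count_word(chunks, word, workers=4):
    """Run the map step over all chunks in parallel, then reduce by summing."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial_counts = pool.map(lambda chunk: count_in_chunk(chunk, word), chunks)
        return sum(partial_counts)


# Hypothetical corpus, already split into independent chunks.
CHUNKS = [
    "The cloud is elastic.",
    "Cloud costs vary; the cloud bills per call.",
    "No match here.",
]

print(count_word(CHUNKS, "cloud"))  # 1 + 2 + 0 = 3
```

Because no chunk depends on another, the same decomposition maps onto one function invocation per chunk in a serverless setting or one partition per task in a cluster setting.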

SEDec 21, 2018
Problems and Solutions of Continuous Deployment: A Systematic Review

Antoine Proulx, Francis Raymond, Bruno Roy et al.

Context: The software industry needs to adapt itself to a rapidly changing market. With continuous practices (Continuous Integration, Continuous Delivery and Continuous Deployment), commonly found in Agile development processes, it is possible to deliver new features more frequently to clients; integrating smaller features is less likely to cause conflicts than the more traditional approach of merging big features less frequently, all at once. However, for Continuous Deployment, there is no clear guidance on the best approaches for its implementation. Objective: The goal of this paper is to identify the challenges and the solutions related to Continuous Deployment, and then see which of those solutions can be applied to which challenges. Method: This paper is a systematic literature review (SLR) of the problems and the solutions found when implementing the continuous deployment practice inside an organization. It also presents which solution can be applied to which problem. Thirty-one articles published after 2015 were analyzed for this SLR. Results: 22 problems were grouped into the categories Human and Organizational, Process, Tools, Infrastructure, Application Architecture and Testing. The 19 solutions found were grouped into the categories Human and Organizational, Architecture, Process and Tools. Solutions have been found for 14 problems, and some questions have been identified for future research. Conclusion: This article is meant to serve as a reference for practitioners who want to find out how to solve a specific challenge when implementing the continuous deployment practice.