Marcelo de Almeida Maia

SE
4 papers
58 citations
Novelty 35%
AI Score 22

4 Papers

SE · Jul 20, 2021 · Code
On the Interplay of Smells Large Class, Complex Class and Duplicate Code

Elder Vicente de Paulo Sobrinho, Marcelo de Almeida Maia

Bad smells have been defined to describe potential problems in code, possibly pointing out refactoring opportunities. Several empirical studies have highlighted that smells have a negative impact on comprehension and maintainability. Consequently, several approaches have been proposed to detect and restructure them. However, studies on the inter-relationship between occurrences of different types of smells in source code are still lacking, especially those focused on quantifying this inter-relationship. In this work, we aim to understand and quantify the possible inter-relation of the smells Large Class (LC), Complex Class (CC), and Duplicate Code (DC). In particular, we investigate patterns of LC and CC regarding the presence or absence of duplicate code. We conduct a quantitative study on five open source projects, along with a qualitative analysis, to measure and understand the association of specific smells. As one of the main results, we highlight that there are "occurrence patterns" among these smells. For example, either in Complex Class or in the co-occurrence of Large Class and Complex Class, clones tend to be more prevalent in highly complex classes than in less complex classes. The patterns found could be used to improve the performance of detection tools or even help in refactoring tasks.
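As an illustration of what quantifying such an association might look like, the sketch below computes the share of classes containing clones within a smell group, plus an odds ratio from a 2x2 table. It is not the authors' method: the abstract does not name the statistics used, and the record type, the sample data, and the choice of an odds ratio with a 0.5 continuity correction are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass(frozen=True)
class ClassRecord:
    """Hypothetical per-class smell flags, as a detector might report them."""
    name: str
    large: bool       # flagged as Large Class (LC)
    complex_: bool    # flagged as Complex Class (CC)
    has_clone: bool   # contains Duplicate Code (DC)


Predicate = Callable[[ClassRecord], bool]


def clone_rate(records: List[ClassRecord], in_group: Predicate) -> Optional[float]:
    """Fraction of classes in the group that contain duplicate code."""
    group = [r for r in records if in_group(r)]
    if not group:
        return None
    return sum(r.has_clone for r in group) / len(group)


def odds_ratio(records: List[ClassRecord], in_group: Predicate) -> float:
    """Odds of having clones inside the group versus outside it.

    Adds 0.5 to every cell (an assumed continuity correction) so that
    empty cells do not cause a division by zero.
    """
    a = b = c = d = 0
    for r in records:
        if in_group(r) and r.has_clone:
            a += 1          # in group, with clones
        elif in_group(r):
            b += 1          # in group, no clones
        elif r.has_clone:
            c += 1          # outside group, with clones
        else:
            d += 1          # outside group, no clones
    return ((a + 0.5) * (d + 0.5)) / ((b + 0.5) * (c + 0.5))


# Made-up sample data, for illustration only.
SAMPLE = [
    ClassRecord("A", True, True, True),
    ClassRecord("B", True, True, True),
    ClassRecord("C", True, False, False),
    ClassRecord("D", False, True, True),
    ClassRecord("E", False, False, False),
    ClassRecord("F", False, False, False),
    ClassRecord("G", True, True, False),
    ClassRecord("H", False, False, True),
]

if __name__ == "__main__":
    print(clone_rate(SAMPLE, lambda r: r.complex_))
    print(clone_rate(SAMPLE, lambda r: r.large and r.complex_))
    print(odds_ratio(SAMPLE, lambda r: r.complex_))
```

An odds ratio above 1 for the Complex Class group would be consistent with clones being more prevalent there, though a real study would also need a significance test and far more data.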

SE · Mar 17, 2021
Towards a question answering assistant for software development using a transformer-based language model

Liliane do Nascimento Vale, Marcelo de Almeida Maia

Question answering platforms, such as Stack Overflow, have substantially changed how developers search for solutions to their programming problems. The crowd knowledge available on such platforms has also been used to improve software development tools. Recent advances in Natural Language Processing, specifically more powerful language models, have demonstrated the ability to enhance text understanding and generation. In this context, we aim to investigate the factors that can influence the application of such models to understanding source-code-related data and to producing more interactive and intelligent assistants for software development. In this preliminary study, we particularly investigate whether a how-to question filter and the level of context in the question affect the results of a transformer-based question answering model. Our results suggest that fine-tuning models on a corpus of how-to questions can positively impact the model, and that more contextualized questions also induce more objective answers.

SE · Mar 21, 2019
Bootstrapping Cookbooks for APIs from Crowd Knowledge on Stack Overflow

Lucas B. L. Souza, Eduardo C. Campos, Fernanda Madeiral et al.

Well-established libraries typically have API documentation. However, this documentation frequently lacks examples and explanations, which can make effective reuse of the libraries difficult. Stack Overflow is a question-and-answer website oriented to issues related to software development. Despite the increasing adoption of Stack Overflow, the information related to a particular topic (e.g., an API) is spread across the website. Thus, Stack Overflow still lacks organization of the crowd knowledge available on it. Our goal is to address the problem of poor-quality API documentation by providing an alternative artifact, called a crowd cookbook, that documents APIs based on the crowd knowledge available on Stack Overflow. A cookbook is a recipe-oriented book, and we refer to our cookbook as a crowd cookbook since it contains content generated by a crowd. The cookbooks are meant to be used through an exploration process, i.e., browsing. In this paper, we present a semi-automatic approach that organizes the crowd knowledge available on Stack Overflow to build cookbooks for APIs. We have generated cookbooks for three APIs widely used by the software development community: SWT, LINQ and QT. We have also defined desired properties that crowd cookbooks must meet, and we conducted an evaluation of the cookbooks against these properties with human subjects. The results showed that the cookbooks built using our approach, in general, meet those properties. As a highlight, most of the recipes were considered appropriate for the cookbooks and contained self-contained information. We concluded that our approach is capable of producing adequate cookbooks automatically, which can be as useful as manually produced cookbooks. This opens an opportunity for API designers to enrich existing cookbooks with the different points of view from the crowd, or even to generate initial versions of new cookbooks.

SE · Mar 18, 2019
Recommending Comprehensive Solutions for Programming Tasks by Mining Crowd Knowledge

Rodrigo F. G. Silva, Chanchal K. Roy, Mohammad Masudur Rahman et al.

Developers often search for relevant code examples on the web for their programming tasks. Unfortunately, they face two major problems. First, the search is impaired by a lexical gap between their query (task description) and the information associated with the solution. Second, the retrieved solution may not be comprehensive, i.e., the code segment might lack a succinct explanation. These problems force developers to browse dozens of documents in order to synthesize an appropriate solution. To address these two problems, we propose CROKAGE (Crowd Knowledge Answer Generator), a tool that takes the description of a programming task (the query) and provides a comprehensive solution for the task. Our solutions contain not only relevant code examples but also their succinct explanations. Our proposed approach expands the task description with relevant API classes from Stack Overflow Q&A threads, thereby mitigating the lexical gap problem. Furthermore, we perform natural language processing on the top-quality answers and, unlike earlier studies, return programming solutions containing both code examples and code explanations. We evaluate our approach using 48 programming queries and show that it outperforms six baselines, including the state of the art, by a statistically significant margin. Furthermore, our evaluation with 29 developers using 24 tasks (queries) confirms the superiority of CROKAGE over the state-of-the-art tool in terms of relevance of the suggested code examples, benefit of the code explanations, and overall solution quality (code + explanation).
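The idea of expanding a task description with API classes drawn from Q&A threads can be sketched in a few lines. This is only a toy illustration, not CROKAGE's implementation: the thread structure, the word-overlap ranking, and the `expand_query` helper are assumptions introduced here, and the sample threads are invented.

```python
import re
from collections import Counter
from typing import Dict, List, Set


def tokenize(text: str) -> Set[str]:
    """Lowercase the text and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def expand_query(query: str, threads: List[Dict], top_threads: int = 2,
                 top_classes: int = 2) -> str:
    """Append the API classes most common in the best-matching threads.

    Threads are ranked by plain word overlap between the query and the
    thread title (an assumed, deliberately simple relevance measure).
    """
    query_tokens = tokenize(query)
    ranked = sorted(
        threads,
        key=lambda t: len(query_tokens & tokenize(t["title"])),
        reverse=True,
    )
    counts: Counter = Counter()
    for thread in ranked[:top_threads]:
        # Skip threads that share no words with the query at all.
        if query_tokens & tokenize(thread["title"]):
            counts.update(thread["api_classes"])
    classes = [name for name, _ in counts.most_common(top_classes)]
    return f"{query} {' '.join(classes)}" if classes else query


# Invented sample threads; "api_classes" stands for classes found in answers.
THREADS = [
    {"title": "How to read a text file line by line in Java",
     "api_classes": ["BufferedReader", "FileReader"]},
    {"title": "Read file into string Java",
     "api_classes": ["Files", "BufferedReader"]},
    {"title": "Sort a list of integers",
     "api_classes": ["Collections"]},
]

if __name__ == "__main__":
    print(expand_query("read file line by line", THREADS, top_classes=1))
```

The expanded query carries API class names that are likely to appear in relevant answers even when the developer's own wording does not, which is one way a lexical gap can be narrowed.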