SE May 29
Beyond Strict Rules: Assessing the Effectiveness of Large Language Models for Code Smell Detection
Saymon Souza, Amanda Santana, Eduardo Figueiredo et al.
Code smells are symptoms of potential code quality problems that may affect software maintainability, thus increasing development costs and impacting software reliability. Large language models (LLMs) have shown remarkable capabilities for supporting various software engineering activities, but their use for detecting code smells remains underexplored. However, unlike the rigid rules of static analysis tools, LLMs can support flexible and adaptable detection strategies tailored to the unique properties of code smells. This paper evaluates the effectiveness of four LLMs -- DeepSeek-R1, GPT-5 mini, Llama-3.3, and Qwen2.5-Code -- for detecting nine code smells across 30 Java projects. For the empirical evaluation, we created a ground-truth dataset by asking 76 developers to manually inspect 268 code-smell candidates. Our results indicate that LLMs perform strongly for structurally straightforward smells, such as Large Class and Long Method. However, we also observed that different LLMs and tools fare better for distinct code smells. We then propose and evaluate a detection strategy that combines LLMs and static analysis tools. The proposed strategy outperforms LLMs and tools in five out of nine code smells in terms of F1-Score. However, it also generates more false positives for complex smells. Therefore, we conclude that the optimal strategy depends on whether Recall or Precision is the main priority for code smell detection.
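The Recall versus Precision trade-off mentioned above can be illustrated with a small sketch. It scores a hypothetical LLM detector, a hypothetical static analysis tool, and their union against a toy ground truth. The union rule, the class names, and all flags are assumptions made for illustration; the abstract does not say how the paper's strategy actually combines the two sources.

```python
def precision_recall_f1(predicted, actual):
    """Return (precision, recall, F1) for a set of flagged items vs. ground truth."""
    true_positives = len(predicted & actual)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(actual) if actual else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Toy data (hypothetical): identifiers of classes judged smelly.
ground_truth = {"A", "B", "C", "D"}
llm_flags = {"A", "B", "E"}    # assumed LLM output
tool_flags = {"B", "C", "F"}   # assumed static analysis output

# Assumed combination rule: flag a class if either source flags it.
combined_flags = llm_flags | tool_flags

for name, flags in [("LLM", llm_flags), ("tool", tool_flags), ("combined", combined_flags)]:
    p, r, f1 = precision_recall_f1(flags, ground_truth)
    print(f"{name:9s} precision={p:.2f} recall={r:.2f} F1={f1:.2f}")
```

In this toy case the union raises Recall and F1 while lowering Precision, because it inherits the false positives of both sources. An intersection rule would push the trade-off the other way.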
LG Mar 11
Evaluating randomized smoothing as a defense against adversarial attacks in trajectory prediction
Julian F. Schumann, Eduardo Figueiredo, Frederik Baymler Mathiesen et al.
Accurate and robust trajectory prediction is essential for safe and efficient autonomous driving, yet recent work has shown that even state-of-the-art prediction models are highly vulnerable to inputs being mildly perturbed by adversarial attacks. Although model vulnerabilities to such attacks have been studied, work on effective countermeasures remains limited. In this work, we develop and evaluate a new defense mechanism for trajectory prediction models based on randomized smoothing -- an approach previously applied successfully in other domains. We evaluate its ability to improve model robustness through a series of experiments that test different strategies of randomized smoothing. We show that our approach can consistently improve prediction robustness of multiple base trajectory prediction models in various datasets without compromising accuracy in non-adversarial settings. Our results demonstrate that randomized smoothing offers a simple and computationally inexpensive technique for mitigating adversarial attacks in trajectory prediction.
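As a rough illustration of the general idea behind randomized smoothing, the sketch below averages a predictor's output over many noise-perturbed copies of the observed trajectory. The constant-velocity base model, the Gaussian noise, the mean aggregation, and every parameter value are assumptions chosen for brevity; the abstract says several smoothing strategies were tested but does not specify them.

```python
import random


def constant_velocity_predictor(history, horizon=3):
    """Hypothetical base model: extrapolate the last observed step."""
    (x0, y0), (x1, y1) = history[-2], history[-1]
    vx, vy = x1 - x0, y1 - y0
    return [(x1 + vx * k, y1 + vy * k) for k in range(1, horizon + 1)]


def smoothed_predict(predictor, history, sigma=0.1, n_samples=200, seed=0):
    """Average the predictor's output over Gaussian-perturbed copies of the input."""
    rng = random.Random(seed)
    sums = None
    for _ in range(n_samples):
        # Perturb every observed position with independent Gaussian noise.
        noisy = [(x + rng.gauss(0, sigma), y + rng.gauss(0, sigma)) for x, y in history]
        prediction = predictor(noisy)
        if sums is None:
            sums = [[0.0, 0.0] for _ in prediction]
        for acc, (px, py) in zip(sums, prediction):
            acc[0] += px
            acc[1] += py
    return [(sx / n_samples, sy / n_samples) for sx, sy in sums]


history = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
clean = constant_velocity_predictor(history)
smoothed = smoothed_predict(constant_velocity_predictor, history)
print(clean)
print(smoothed)
```

Because the output is an average over many perturbed inputs, a single small adversarial shift of the input has limited influence on the result, while the prediction on clean data stays close to that of the base model.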
SE Feb 4, 2025
Evaluating the Effectiveness of LLMs in Fixing Maintainability Issues in Real-World Projects
Henrique Nunes, Eduardo Figueiredo, Larissa Rocha et al.
Large Language Models (LLMs) have gained attention for addressing coding problems, but their effectiveness in fixing code maintainability issues remains unclear. This study evaluates LLMs' capability to resolve 127 maintainability issues from 10 GitHub repositories. We use zero-shot prompting for Copilot Chat and Llama 3.1, and few-shot prompting with Llama only. The LLM-generated solutions are assessed for compilation errors, test failures, and new maintainability problems. Llama with few-shot prompting successfully fixed 44.9% of the methods, while Copilot Chat and Llama zero-shot fixed 32.29% and 30%, respectively. However, most solutions introduced errors or new maintainability issues. We also conducted a human study with 45 participants to evaluate the readability of 51 LLM-generated solutions. The human study showed that 68.63% of participants observed improved readability. Overall, while LLMs show potential for fixing maintainability issues, their introduction of errors highlights their current limitations.
SE Apr 20, 2021
Understanding the Context of IoT Software Systems in DevOps
Igor Muzetti Pereira, Tiago Garcia de Senna Carneiro, Eduardo Figueiredo
The growing demand for connected devices and the increase in investments in the Internet of Things (IoT) sector drive the growth of the market for this technology. IoT permeates all areas of an individual's life, from smartwatches to entire home assistants and solutions in different areas. The IoT concept is gradually spreading all over the globe. IoT projects call for coordinated studies in software engineering to prepare the development and operation of software systems that are embodied in physical objects and structures, interconnected with embedded software, and hosted in clouds. IoT projects have boundaries between development and operation stages. This study searches for evidence in the scientific literature to support these boundaries through Development and Operations (DevOps) principles. We rely on a Systematic Literature Review to investigate the relations of DevOps in IoT software systems. As a result, we identify concepts and characterize the benefits and challenges in the context of knowledge previously reported in primary studies in the literature. The main contributions of this paper are: (i) a discussion of benefits and challenges for DevOps in IoT software systems, (ii) the identification of tools, concepts, and programming languages used, and (iii) a perceived pipeline for this kind of software development.
SE Apr 4, 2020
Efficiently Finding Higher-Order Mutants
Chu-Pan Wong, Jens Meinicke, Leo Chen et al.
Higher-order mutation has the potential to overcome major drawbacks of traditional first-order mutation, for example by simulating more realistic faults or improving test optimization techniques. Despite interest in studying promising higher-order mutants, such mutants are difficult to find due to the exponential search space of mutation combinations. State-of-the-art approaches rely on genetic search, which is often incomplete and expensive due to its stochastic nature. First, we propose a novel way of finding a complete set of higher-order mutants by using variational execution, a technique that can, in many cases, explore large search spaces completely and often efficiently. Second, we use the identified complete set of higher-order mutants to study their characteristics. Finally, we use the identified characteristics to design and evaluate a new search strategy, independent of variational execution, that is highly effective at finding higher-order mutants even in large code bases.
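To give a feel for why the search space of mutation combinations grows so quickly, the sketch below enumerates higher-order mutants by brute force as combinations of first-order mutations and counts how many exist up to a given order. This is plain enumeration, not variational execution or the paper's search strategy. The mutation labels are hypothetical, and the sketch assumes any subset of first-order mutations can be applied together.

```python
from itertools import combinations
from math import comb


def higher_order_mutants(first_order, max_order):
    """Yield every combination of 2..max_order first-order mutations."""
    for order in range(2, max_order + 1):
        yield from combinations(first_order, order)


def search_space_size(n_first_order, max_order):
    """Count the higher-order mutants of order 2..max_order over n mutations."""
    return sum(comb(n_first_order, k) for k in range(2, max_order + 1))


# Hypothetical first-order mutations of some program.
first_order = ["m1", "m2", "m3", "m4"]
candidates = list(higher_order_mutants(first_order, max_order=3))
print(len(candidates), candidates[:3])

# The count grows rapidly with the number of first-order mutations.
for n in (10, 100, 1000):
    print(n, search_space_size(n, max_order=4))
```

Each enumerated candidate would still need to be compiled and run against the test suite, which is what makes exhaustive brute-force search impractical on large code bases.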
SE Oct 24, 2019
How Do Code Changes Evolve in Different Platforms? A Mining-based Investigation
Markos Viggiato, Johnatan Oliveira, Eduardo Figueiredo et al.
Code changes are performed differently on mobile and non-mobile platforms. Prior work has investigated the differences in specific platforms. However, we still lack a deeper understanding of how code changes evolve across different software platforms. In this paper, we present a study investigating the frequency of changes and how source code, build, and test changes co-evolve in mobile and non-mobile platforms. We developed regression models to explain which factors influence the frequency of changes and applied the Apriori algorithm to find types of changes that frequently co-occur. Our findings show that non-mobile repositories have a higher number of commits per month, and our regression models suggest that being mobile has a significant negative effect on the number of commits when controlling for confounding factors, such as code size. We also found that developers do not usually change source code files together with build or test files. We argue that our results can provide valuable information for developers on how changes are performed in different platforms so that practices adopted in successful software systems can be followed.
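The co-occurrence analysis mentioned above can be sketched with a compact version of the Apriori algorithm, where each commit is treated as a transaction containing the types of files it changed. The commit data and the support threshold below are invented for illustration and are not taken from the study.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support} for every itemset meeting min_support."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def support(itemset):
        return sum(itemset <= t for t in transactions) / n

    # Level 1: frequent single items.
    items = {item for t in transactions for item in t}
    current = {frozenset([i]) for i in items}
    current = {s for s in current if support(s) >= min_support}

    frequent = {}
    k = 1
    while current:
        for itemset in current:
            frequent[itemset] = support(itemset)
        k += 1
        # Join step: merge frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(sub) in current for sub in combinations(c, k - 1))
        }
        current = {c for c in candidates if support(c) >= min_support}
    return frequent


# Hypothetical commits, each listing the kinds of files it touched.
commits = [
    {"source"}, {"source"}, {"source", "test"}, {"source"},
    {"build"}, {"source", "test"}, {"test"}, {"source", "build"},
]
frequent = apriori(commits, min_support=0.25)
for itemset, sup in sorted(frequent.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), sup)
```

In this toy data most commits touch only source files, so pairs such as source together with build fall below the threshold.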
SE Aug 14, 2018
Microservices in Practice: A Survey Study
Markos Viggiato, Ricardo Terra, Henrique Rocha et al.
Microservices architectures have become widely popular in recent years. However, we still lack empirical evidence about the use of microservices and the practices followed by practitioners. Therefore, in this paper, we report the results of a survey with 122 professionals who work with microservices. We report how the industry is using this architectural style and whether the perception of practitioners regarding the advantages and challenges of microservices is in line with the literature.