Naouel Moha

h-index: 28

14 papers

54 citations

Novelty: 30%

AI Score: 50

Ranked #22,195 of 194,257 authors (top 11%); #180 in SE (top 6%)

14 Papers

7.9 · SE · May 21 · Code

LLM Code Smells: A Taxonomy and Detection Approach

Zacharie Chenail-Larcher, Brahim Mahmoudi, Naouel Moha et al.

Large Language Models (LLMs) are increasingly integrated into software systems for diverse purposes, due to their versatility, flexibility, and ability to simulate human reasoning to some extent. However, poor integration of LLM inference in source code can undermine software system quality. Therefore, inadequate LLM integration coding practices must be documented to help developers mitigate such issues. Following our earlier work on LLM code smells, this paper consolidates and refines the concept by presenting a self-contained taxonomy and a catalog of nine LLM code smells. We also create SpecDetect4LLM, a static source code analysis tool for their detection, and conduct extensive empirical evaluations of its detection effectiveness (precision and recall) as well as the prevalence of LLM code smells across 692 open-source software projects (171,194 source files). Our results show that LLM code smells affect 73.5% of the analyzed systems, with a detection precision of 91.3% and a recall of 71.8%.

5.9 · SE · Dec 19, 2025 · Code

Specification and Detection of LLM Code Smells

Brahim Mahmoudi, Zacharie Chenail-Larcher, Naouel Moha et al.

Large Language Models (LLMs) have gained massive popularity in recent years and are increasingly integrated into software systems for diverse purposes. However, poorly integrating them in source code may undermine software system quality. Yet, to our knowledge, there is no formal catalog of code smells specific to coding practices for LLM inference. In this paper, we introduce the concept of LLM code smells and formalize five recurrent problematic coding practices related to LLM inference in software systems, based on relevant literature. We extend the detection tool SpecDetect4AI to cover the newly defined LLM code smells and use it to validate their prevalence in a dataset of 200 open-source LLM systems. Our results show that LLM code smells affect 60.50% of the analyzed systems, with a detection precision of 86.06%.

7.2 · SE · Mar 18 · Code

MLmisFinder: A Specification and Detection Approach of Machine Learning Service Misuses

Hadil Ben Amor, Niruthiha Selvanayagam, Manel Abdellatif et al.

Machine Learning (ML) cloud services, offered by leading providers such as Amazon, Google, and Microsoft, enable the integration of ML components into software systems without building models from scratch. However, the rapid adoption of ML services, coupled with the growing complexity of business requirements, has led to widespread misuses, compromising the quality, maintainability, and evolution of ML service-based systems. Though prior research has studied patterns and antipatterns in service-based and ML-based systems separately, automatic detection of ML service misuses remains a challenge. In this paper, we propose MLmisFinder, an automatic approach to detect ML service misuses in software systems, aiming to identify instances of improper use of ML services to help developers properly integrate ML components in ML service-based systems. We propose a metamodel that captures the data needed to detect misuses in ML service-based systems and apply a set of rule-based detection algorithms for seven misuse types. We evaluated MLmisFinder on 107 software systems collected from open-source GitHub repositories and compared it with a state-of-the-art baseline. Our results show that MLmisFinder effectively detects ML service misuses, achieving an average precision of 96.7% and recall of 97%, outperforming the state-of-the-art baseline. MLmisFinder also scaled efficiently to detect misuses across 817 ML service-based systems and revealed that such misuses are widespread, especially in areas such as data drift monitoring and schema validation.

6.0 · SE · Apr 12

DynamicsLLM: a Dynamic Analysis-based Tool for Generating Intelligent Execution Traces Using LLMs to Detect Android Behavioural Code Smells

Houcine Abdelkader Cherief, Florent Avellaneda, Naouel Moha

Mobile apps have become an essential part of our daily lives, making code quality a critical concern for developers. Behavioural code smells are characteristics in the source code that induce inappropriate code behaviour during execution, which negatively impact software quality in terms of performance, energy consumption, and memory. Dynamics, the latest state-of-the-art tool-based method, is highly effective at detecting Android behavioural code smells. While it outperforms static analysis tools, it suffers from a high false negative rate, with multiple code smell instances remaining undetected. Large Language Models (LLMs) have achieved notable advances across numerous research domains and offer significant potential for generating intelligent execution traces, particularly for detecting behavioural code smells in Android mobile applications. By intelligent execution trace, we mean a sequence of events generated by specific actions in a way that triggers the identification of a given behaviour. We propose the following three main contributions in this paper: (1) DynamicsLLM, an enhanced implementation of the Dynamics method that leverages LLMs to intelligently generate execution traces. (2) A novel hybrid approach designed to improve the coverage of code smell-related events in applications with a small number of activities. (3) A comprehensive validation of DynamicsLLM on 333 mobile applications from F-DROID, including a comparison with the Dynamics tool. Our results show that, under a limited number of actions, DynamicsLLM configured with 100% LLM covers three times more code smell-related events than Dynamics. The hybrid approach improves LLM coverage by 25.9% for apps containing few activities. Moreover, 12.7% of the code smell-related events that cannot be triggered by Dynamics are successfully triggered by our tool.

5.9 · SE · Sep 24, 2025

AI-Specific Code Smells: From Specification to Detection

Brahim Mahmoudi, Naouel Moha, Quentin Stiévenart et al.

The rise of Artificial Intelligence (AI) is reshaping how software systems are developed and maintained. However, AI-based systems give rise to new software issues that existing detection tools often miss. Among these, we focus on AI-specific code smells, recurring patterns in the code that may indicate deeper problems such as unreproducibility, silent failures, or poor model generalization. We introduce SpecDetect4AI, a tool-based approach for the specification and detection of these code smells at scale. This approach combines a high-level declarative Domain-Specific Language (DSL) for rule specification with an extensible static analysis tool that interprets and detects these rules for AI-based systems. We specified 22 AI-specific code smells and evaluated SpecDetect4AI on 826 AI-based systems (20M lines of code), achieving a precision of 88.66% and a recall of 88.89%, outperforming other existing detection tools. Our results show that SpecDetect4AI supports the specification and detection of AI-specific code smells through dedicated rules and can effectively analyze large AI-based systems, demonstrating both efficiency and extensibility (SUS 81.7/100).

8.6 · SE · Dec 19, 2021

Early Detection of Security-Relevant Bug Reports using Machine Learning: How Far Are We?

Arthur D. Sawadogo, Quentin Guimard, Tegawendé F. Bissyandé et al.

Bug reports are common artefacts in software development. They serve as the main channel for users to communicate to developers information about the issues that they encounter when using released versions of software programs. In the descriptions of issues, however, a user may, intentionally or not, expose a vulnerability. In a typical maintenance scenario, such security-relevant bug reports are prioritised by the development team when preparing corrective patches. Nevertheless, when security relevance is not immediately expressed (e.g., via a tag) or rapidly identified by triaging teams, the open security-relevant bug report can become a critical leak of sensitive information that attackers can leverage to perform zero-day attacks. To support practitioners in triaging bug reports, the research community has proposed a number of approaches for the detection of security-relevant bug reports. In recent years, machine learning-based approaches have been reported with promising performance. Our work focuses on such approaches, and revisits their building blocks to provide a comprehensive view on the current achievements. To that end, we built a large experimental dataset and performed extensive experiments with variations in feature sets and learning algorithms. Ultimately, our study highlights different approach configurations that yield the best-performing classifiers.

3.0 · SE · Oct 14, 2020

Android Code Smells: From Introduction to Refactoring

Sarra Habchi, Naouel Moha, Romain Rouvoy

Object-oriented code smells are well-known concepts in software engineering that refer to bad design and development practices commonly observed in software systems. With the emergence of mobile apps, new classes of code smells have been identified by the research community as mobile-specific code smells. These code smells are presented as symptoms of important performance issues or bottlenecks. Despite the multiple empirical studies about these new code smells, their diffuseness and evolution along change histories remain unclear. We present in this article a large-scale empirical study that inspects the introduction, evolution, and removal of Android code smells. This study relies on data extracted from 324 apps, a manual analysis of 561 smell-removing commits, and discussions with 25 Android developers. Our findings reveal that the high diffuseness of mobile-specific code smells is not a result of releasing pressure. We also found that the removal of these code smells is generally a side effect of maintenance activities as developers do not refactor smell instances even when they are aware of them.

15.7 · SE · Jan 24, 2020

Learning to Catch Security Patches

Arthur D. Sawadogo, Tegawendé F. Bissyandé, Naouel Moha et al.

Timely patching is paramount to safeguard users and maintainers against dire consequences of malicious attacks. In practice, patching is prioritized following the nature of the code change that is committed in the code repository. When such a change is labeled as being security-relevant, i.e., as fixing a vulnerability, maintainers rapidly spread the change and users are notified about the need to update to a new version of the library or of the application. Unfortunately, oftentimes, some security-relevant changes go unnoticed as they represent silent fixes of vulnerabilities. In this paper, we propose a Co-Training-based approach to catch security patches as part of an automatic monitoring service of code repositories. Leveraging different classes of features, we empirically show that such automation is feasible and can yield a precision of over 90% in identifying security patches, with an unprecedented recall of over 80%. Beyond such a benchmarking with ground truth data which demonstrates an improvement over the state-of-the-art, we confirmed that our approach can help catch security patches that were not reported as such.

2.8 · SE · Jun 3, 2019

Service-Oriented Re-engineering of Legacy JEE Applications: Issues and Research Directions

Hafedh Mili, Ghizlane El-Boussaidi, Anas Shatnawi et al.

Service-orientation views applications as orchestrations of independent software services that (1) implement functions that are reusable across many applications, (2) can be invoked remotely, and (3) are packaged to decouple potential callers from their implementation technology. As such, it enables organizations to develop quality applications faster than without services. Legacy applications are not service-oriented. Yet, they implement many reusable functions that could be exposed as services. Organizations face three main issues when re-engineering legacy applications to (re)use services: (1) to mine their existing applications for reusable functions that can become services, (2) to package those functions into services, and (3) to refactor legacy applications to invoke those services to ease future maintenance. In this paper, we explore these three issues and propose research directions to address them. We chose to focus on the service-oriented re-engineering of recent legacy object-oriented applications, and more specifically, on JEE applications, for several reasons. First, we wanted to focus on architectural challenges, and thus we chose not to deal with programming language differences between the source and target systems. We chose JEE applications, in particular, because they embody the range of complexities that one can encounter in recent legacy applications, namely, multi-language systems, multi-tier applications, the reliance on external configuration files, and the reliance on frameworks and container services during runtime. These characteristics pose unique challenges for the three issues mentioned above.

11.1 · SE · Jun 3, 2019

Static Code Analysis of Multilanguage Software Systems