Romain Rouvoy

SE
h-index: 6
12 papers
252 citations
Novelty: 46%
AI Score: 43

12 Papers

SE · Jul 31, 2024
A Performance Study of LLM-Generated Code on Leetcode

Tristan Coignion, Clément Quinton, Romain Rouvoy

This study evaluates the efficiency of code generation by Large Language Models (LLMs) and measures their performance against human-crafted solutions using a dataset from Leetcode. We compare 18 LLMs, considering factors such as model temperature and success rate, and their impact on code performance. This research introduces a novel method for measuring and comparing the speed of LLM-generated code, revealing that LLMs produce code with comparable performance, irrespective of the adopted LLM. We also find that LLMs are capable of generating code that is, on average, more efficient than the code written by humans. The paper further discusses the use of Leetcode as a benchmarking dataset, the limitations imposed by potential data contamination, and the platform's measurement reliability. We believe that our findings contribute to a better understanding of LLM capabilities in code generation and set the stage for future optimizations in the field.

DB · Mar 16
DOT: Dynamic Knob Selection and Online Sampling for Automated Database Tuning

Yifan Wang, Debabrota Basu, Pierre Bourhis et al.

Database Management Systems (DBMS) are crucial for efficient data management and access control, but their administration remains challenging for Database Administrators (DBAs). Tuning, in particular, is known to be difficult. Modern systems have many tuning parameters, but only a subset significantly impacts performance. Focusing on these influential parameters reduces the search space and optimizes performance. Current methods rely on costly warm-up phases and human expertise to identify important tuning parameters. In this paper, we present DOT, a dynamic knob selection and online sampling DBMS tuning algorithm. DOT uses Recursive Feature Elimination with Cross-Validation (RFECV) to prune low-importance tuning parameters and a Likelihood Ratio Test (LRT) strategy to balance exploration and exploitation. For parameter search, DOT uses a Bayesian Optimization (BO) algorithm to optimize configurations on the fly, eliminating the need for warm-up phases or prior knowledge (although existing knowledge can be incorporated). Experiments show that DOT matches or outperforms state-of-the-art tuners while substantially reducing tuning overhead.

LG · Feb 6
Pimp My LLM: Leveraging Variability Modeling to Tune Inference Hyperparameters

Nada Zine, Clément Quinton, Romain Rouvoy

Large Language Models (LLMs) are being increasingly used across a wide range of tasks. However, their substantial computational demands raise concerns about the energy efficiency and sustainability of both training and inference. Inference, in particular, dominates total compute usage, making its optimization crucial. Recent research has explored optimization techniques and analyzed how configuration choices influence energy consumption. Yet, the vast configuration space of inference servers makes exhaustive empirical evaluation infeasible due to combinatorial explosion. In this paper, we introduce a new perspective on this problem by treating LLMs as configurable systems and applying variability management techniques to systematically analyze inference-time configuration choices. We evaluate our approach on the Hugging Face Transformers library by representing generation hyperparameters and their constraints using a feature-based variability model, sampling representative configurations, measuring their energy consumption, latency, and accuracy, and learning predictive models from the collected data. Our results show that variability modeling effectively manages the complexity of LLM inference configurations. It enables systematic analysis of hyperparameter effects and interactions, reveals trade-offs, and supports accurate prediction of inference behavior from a limited number of measurements. Overall, this work opens a new research direction that bridges software engineering and machine learning by leveraging variability modeling for the efficient and sustainable configuration of LLMs.
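
The idea of modeling generation hyperparameters with constraints and then sampling valid configurations can be illustrated with a minimal Python sketch. The parameter names mirror common Hugging Face generation arguments, but the value ranges, the two constraints, and the uniform sampling strategy are assumptions chosen for illustration; they are not the feature model or sampler used in the paper.

```python
import itertools
import random

# Hypothetical option space; the values are illustrative assumptions.
OPTIONS = {
    "do_sample": [False, True],
    "temperature": [None, 0.2, 0.7, 1.0],
    "top_k": [None, 10, 50],
    "num_beams": [1, 2, 4],
}


def is_valid(cfg):
    """Check the assumed cross-option constraints of this sketch."""
    knobs_set = cfg["temperature"] is not None or cfg["top_k"] is not None
    # Assumption 1: sampling knobs are only selectable when sampling is on.
    if knobs_set and not cfg["do_sample"]:
        return False
    # Assumption 2: when sampling is on, a temperature must be chosen.
    if cfg["do_sample"] and cfg["temperature"] is None:
        return False
    return True


def valid_configurations(options=OPTIONS):
    """Enumerate every combination that satisfies the constraints."""
    names = list(options)
    for values in itertools.product(*(options[n] for n in names)):
        cfg = dict(zip(names, values))
        if is_valid(cfg):
            yield cfg


def sample_configurations(n, seed=0):
    """Draw up to n distinct valid configurations uniformly at random."""
    pool = list(valid_configurations())
    rng = random.Random(seed)
    return rng.sample(pool, min(n, len(pool)))
```

Enumerating the full space is only feasible because this toy model is tiny; for a realistic configuration space, the enumeration step would have to be replaced by a constraint-aware sampler.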

SE · Nov 7, 2024
Green My LLM: Studying the key factors affecting the energy consumption of code assistants

Tristan Coignion, Clément Quinton, Romain Rouvoy

In recent years, Large Language Models (LLMs) have significantly improved in generating high-quality code, enabling their integration into developers' Integrated Development Environments (IDEs) as code assistants. These assistants, such as GitHub Copilot, deliver real-time code suggestions and can greatly enhance developers' productivity. However, the environmental impact of these tools, in particular their energy consumption, remains a key concern. This paper investigates the energy consumption of LLM-based code assistants by simulating developer interactions with GitHub Copilot and analyzing various configuration factors. We collected a dataset of development traces from 20 developers and conducted extensive software project development simulations to measure energy usage under different scenarios. Our findings reveal that the energy consumption and performance of code assistants are influenced by various factors, such as the number of concurrent developers, model size, quantization methods, and the use of streaming. Notably, a substantial portion of generation requests made by GitHub Copilot is either canceled or rejected by developers, indicating a potential area for reducing wasted computations. Based on these findings, we share actionable insights into optimizing configurations for different use cases, demonstrating that careful adjustments can lead to significant energy savings.

CR · Jan 24, 2022
DRAWNAPART: A Device Identification Technique based on Remote GPU Fingerprinting

Tomer Laor, Naif Mehanna, Antonin Durey et al.

Browser fingerprinting aims to identify users or their devices, through scripts that execute in the users' browser and collect information on software or hardware characteristics. It is used to track users or as an additional means of identification to improve security. In this paper, we report on a new technique that can significantly extend the tracking time of fingerprint-based tracking methods. Our technique, which we call DrawnApart, is a new GPU fingerprinting technique that identifies a device based on the unique properties of its GPU stack. Specifically, we show that variations in speed among the multiple execution units that comprise a GPU can serve as a reliable and robust device signature, which can be collected using unprivileged JavaScript. We investigate the accuracy of DrawnApart under two scenarios. In the first scenario, our controlled experiments confirm that the technique is effective in distinguishing devices with similar hardware and software configurations, even when they are considered identical by current state-of-the-art fingerprinting algorithms. In the second scenario, we integrate a one-shot learning version of our technique into a state-of-the-art browser fingerprint tracking algorithm. We verify our technique through a large-scale experiment involving data collected from over 2,500 crowd-sourced devices over a period of several months and show it provides a boost of up to 67% to the median tracking duration, compared to the state-of-the-art method. DrawnApart makes two contributions to the state of the art in browser fingerprinting. On the conceptual front, it is the first work that explores the manufacturing differences between identical GPUs and the first to exploit these differences in a privacy context. On the practical front, it demonstrates a robust technique for distinguishing between machines with identical hardware and software configurations.

SE · Aug 12, 2021
Can We Spot Energy Regressions using Developers Tests?

Benjamin Danglot, Jean-Rémy Falleri, Romain Rouvoy

Software Energy Consumption (SEC) is gaining more and more attention. In this paper, we tackle the problem of hinting developers about the SEC of their programs in the context of software developments based on Continuous Integration (CI). In this study, we investigate whether the CI can leverage developers' tests to perform a new class of tests: energy regression testing. Energy regression is similar to performance regression but focuses on the energy consumption of the program instead of standard performance indicators, like execution time or memory consumption. We propose to perform an exploratory study of the usage of developers' tests for energy regression testing. We propose to first investigate whether developers' tests can be used to obtain stable SEC indicators. We then consider whether comparing the SEC of developers' tests between two versions can accurately spot energy regressions introduced by automated program mutations. Finally, we assess whether this comparison can successfully pinpoint the source code lines responsible for energy regressions. Our study will pave the way for automated SEC regression tools that can be readily deployed inside an existing CI infrastructure to raise awareness of SEC issues among practitioners.
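
The comparison step, checking whether a test consumes noticeably more energy in a new version than in the previous one, might look like the following minimal Python sketch. The abstract does not specify a decision rule, so the use of the median over repeated runs, the 5% tolerance, and the function name `energy_regression` are all assumptions made for illustration.

```python
from statistics import median


def energy_regression(baseline_joules, candidate_joules, tolerance=0.05):
    """Compare repeated energy measurements of one test across two versions.

    Returns (is_regression, relative_change). The median is used because it
    is less sensitive to outlier runs than the mean; the tolerance is an
    assumed threshold, not a value taken from the paper.
    """
    if not baseline_joules or not candidate_joules:
        raise ValueError("both versions need at least one measurement")
    base = median(baseline_joules)
    cand = median(candidate_joules)
    if base <= 0:
        raise ValueError("baseline energy must be positive")
    change = (cand - base) / base
    return change > tolerance, change
```

A CI job could call this once per test, for example `energy_regression([10.0, 10.2, 9.8], [11.0, 11.1, 10.9])`, and fail the build when the first element of the result is true.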

SE · Jun 9, 2021
Erratum: Leveraging Flexible Tree Matching to Repair Broken Locators in Web Automation Scripts

Sacha Brisset, Romain Rouvoy, Lionel Seinturier et al.

Web applications are constantly evolving to integrate new features and fix reported bugs. Even an imperceptible change can sometimes entail significant modifications of the Document Object Model (DOM), which is the underlying model used by browsers to render all the elements included in a web application. Scripts that interact with web applications (e.g., web test scripts, crawlers, or robotic process automation) rely on this continuously evolving DOM, which means they are often particularly fragile. More precisely, the major cause of breakages observed in automation scripts is element locators, which are identifiers used by automation scripts to navigate across the DOM. When the DOM evolves, these identifiers tend to break, so the related scripts can no longer locate the intended target elements. For this reason, several contributions explored the idea of automatically repairing broken locators on a page. These works attempt to repair a given broken locator by scanning all elements in the new DOM to find the most similar one. Unfortunately, this approach fails to scale when the complexity of web pages grows, leading either to long computation times or incorrect element repairs. This article, therefore, adopts a different perspective on this problem by introducing a new locator repair solution that leverages tree matching algorithms to relocate broken locators. This solution, named Erratum, implements a holistic approach to reduce the element search space, which greatly eases the locator repair task and drastically improves repair accuracy. We evaluate the robustness of Erratum on a large-scale benchmark composed of realistic and synthetic mutations applied to popular web applications currently deployed in production. Our empirical results demonstrate that Erratum outperforms the accuracy of WATER, a state-of-the-art solution, by 67%.
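
To make the baseline strategy concrete (scanning every element of the new DOM for the one most similar to the element the broken locator used to target), here is a minimal Python sketch. It is not Erratum's tree matching and not WATER's actual scoring: the dictionary representation of elements, the four compared attributes, and the averaged string similarity are assumptions for illustration only.

```python
from difflib import SequenceMatcher

# Attributes compared in this sketch; the choice is an assumption.
KEYS = ("tag", "id", "class", "text")


def similarity(old, new):
    """Average string similarity over the compared attributes (0.0 to 1.0)."""
    scores = [
        SequenceMatcher(None, old.get(k, ""), new.get(k, "")).ratio()
        for k in KEYS
    ]
    return sum(scores) / len(KEYS)


def repair_locator(old_element, new_dom):
    """Exhaustively scan new_dom and return the most similar element."""
    if not new_dom:
        raise ValueError("new_dom must contain at least one element")
    return max(new_dom, key=lambda el: similarity(old_element, el))


old = {"tag": "button", "id": "submit-btn", "class": "btn primary", "text": "Submit"}
new_dom = [
    {"tag": "a", "id": "home", "class": "nav", "text": "Home"},
    {"tag": "button", "id": "submit-button", "class": "btn btn-primary", "text": "Submit"},
    {"tag": "button", "id": "cancel", "class": "btn", "text": "Cancel"},
]
match = repair_locator(old, new_dom)
```

Each repair visits every element of the new DOM and considers each element in isolation, which hints at why this style of repair becomes slow or error-prone on large pages.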

CR · Feb 28, 2021
An iterative technique to identify browser fingerprinting scripts

Antonin Durey, Pierre Laperdrix, Walter Rudametkin et al.

Browser fingerprinting is a stateless identification technique based on browser properties. Together, these properties form an identifier that can be collected without users' notice and has been shown to be unique and stable. As this technique relies on browser properties that serve legitimate purposes, detecting it is challenging. While several studies propose classification techniques, none of these are publicly available, making them difficult to reproduce. This paper proposes a new browser fingerprinting detection technique. Based on an incremental process, it relies on both automatic and manual decisions to be both reliable and fast. The automatic step matches API call similarities between scripts, while the manual step is required to classify a script with different calls. We publicly share our algorithm and implementation to improve the general knowledge on the subject.
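
The split between an automatic similarity step and a manual fallback can be sketched in a few lines of Python. The abstract does not state which similarity measure or threshold is used, so the Jaccard index over sets of API calls, the 0.8 threshold, and the `classify` helper are assumptions for illustration, not the published algorithm.

```python
def jaccard(a, b):
    """Jaccard index between two collections of API call names."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def classify(script_calls, labelled, threshold=0.8):
    """Return the label of the most similar known script, or 'manual'.

    labelled is a list of (api_calls, label) pairs for already classified
    scripts. Scripts below the similarity threshold are deferred to a
    human reviewer.
    """
    best_label, best_score = "manual", 0.0
    for calls, label in labelled:
        score = jaccard(script_calls, calls)
        if score > best_score:
            best_label, best_score = label, score
    return best_label if best_score >= threshold else "manual"


known = [
    ({"HTMLCanvasElement.toDataURL", "navigator.plugins",
      "screen.width", "navigator.userAgent"}, "fingerprinter"),
    ({"document.querySelector", "fetch"}, "benign"),
]
```

In an incremental process, each script that a reviewer labels after a `"manual"` result would be appended to `known`, so later scripts with the same calls are handled automatically.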

SE · Oct 14, 2020
Android Code Smells: From Introduction to Refactoring

Sarra Habchi, Naouel Moha, Romain Rouvoy

Object-oriented code smells are well-known concepts in software engineering that refer to bad design and development practices commonly observed in software systems. With the emergence of mobile apps, new classes of code smells have been identified by the research community as mobile-specific code smells. These code smells are presented as symptoms of important performance issues or bottlenecks. Despite the multiple empirical studies about these new code smells, their diffuseness and evolution along change histories remain unclear. We present in this article a large-scale empirical study that inspects the introduction, evolution, and removal of Android code smells. This study relies on data extracted from 324 apps, a manual analysis of 561 smell-removing commits, and discussions with 25 Android developers. Our findings reveal that the high diffuseness of mobile-specific code smells is not a result of release pressure. We also found that the removal of these code smells is generally a side effect of maintenance activities, as developers do not refactor smell instances even when they are aware of them.

DB · Apr 27, 2020
SFTM: Fast Comparison of Web Documents using Similarity-based Flexible Tree Matching

Sacha Brisset, Romain Rouvoy, Renaud Pawlak et al.

Tree matching techniques have been investigated in many fields, including web data mining and extraction, as a key component to analyze the content of web documents. However, existing tree matching approaches, like Tree-Edit Distance (TED) or Flexible Tree Matching (FTM), fail to scale beyond a few hundred nodes, which is far below the average complexity of existing online web documents and applications. In this paper, we therefore propose a novel Similarity-based Flexible Tree Matching algorithm (SFTM), which is the first algorithm to enable tree matching on real-life web documents with practical computation times. In particular, we approach tree matching as an optimisation problem and we leverage node labels and local topology similarity in order to avoid any combinatorial explosion. Our practical evaluation demonstrates that our approach is qualitatively comparable to the reference implementation of TED, while improving the computation times by two orders of magnitude.

SE · Jul 2, 2018
App Store 2.0: From Crowd Information to Actionable Feedback in Mobile Ecosystems

María Gómez, Bram Adams, Walid Maalej et al.

Given the increasing competition in mobile app ecosystems, improving the experience of users has become a major goal for app vendors. This article introduces a visionary app store, called APP STORE 2.0, which exploits crowdsourced information about apps, devices and users to increase the overall quality of the delivered mobile apps. We sketch a blueprint architecture of the envisioned app store and discuss the different kinds of actionable feedback that app stores can generate using crowdsourced information.

DC · May 4, 2018
SecureStreams: A Reactive Middleware Framework for Secure Data Stream Processing

Aurélien Havet, Rafael Pires, Pascal Felber et al.

The growing adoption of distributed data processing frameworks in a wide diversity of application domains challenges end-to-end integration of properties like security, in particular when considering deployments in the context of large-scale clusters or multi-tenant Cloud infrastructures. This paper therefore introduces SecureStreams, a reactive middleware framework to deploy and process secure streams at scale. Its design combines the high-level reactive dataflow programming paradigm with Intel's low-level Software Guard Extensions (SGX) in order to guarantee privacy and integrity of the processed data. The experimental results of SecureStreams are promising: while offering a fluent scripting language based on Lua, our middleware delivers high processing throughput, thus enabling developers to implement secure processing pipelines in just a few lines of code.