Francisco Durán

SE
h-index6
4papers
42citations
Novelty18%
AI Score18

4 Papers

SEJul 19, 2023
Towards green AI-based software systems: an architecture-centric approach (GAISSA)

Silverio Martínez-Fernández, Xavier Franch, Francisco Durán

Nowadays, AI-based systems have achieved outstanding results and have outperformed humans in different domains. However, the processes of training AI models and inferring from them require high computational resources, which pose a significant challenge in the current energy efficiency societal demand. To cope with this challenge, this research project paper describes the main vision, goals, and expected outcomes of the GAISSA project. The GAISSA project aims at providing data scientists and software engineers tool-supported, architecture-centric methods for the modelling and development of green AI-based systems. Although the project is in an initial stage, we describe the current research results, which illustrate the potential to achieve GAISSA objectives.

LGFeb 12, 2024
Identifying architectural design decisions for achieving green ML serving

Francisco Durán, Silverio Martínez-Fernández, Matias Martinez et al.

The growing use of large machine learning models highlights concerns about their increasing computational demands. While the energy consumption of their training phase has received attention, fewer works have considered the inference phase. For ML inference, the binding of ML models to the ML system for user access, known as ML serving, is a critical yet understudied step for achieving efficiency in ML applications. We examine the literature in ML architectural design decisions and Green AI, with a special focus on ML serving. The aim is to analyze ML serving architectural design decisions for the purpose of understanding and identifying them with respect to quality characteristics from the point of view of researchers and practitioners in the context of ML serving literature. Our results (i) identify ML serving architectural design decisions along with their corresponding components and associated technological stack, and (ii) provide an overview of the quality characteristics studied in the literature, including energy efficiency. This preliminary study is the first step in our goal to achieve green ML serving. Our analysis may aid ML researchers and practitioners in making green-aware architecture design decisions when serving their models.

SEDec 19, 2024
Insights into resource utilization of code small language models serving with runtime engines and execution providers

Francisco Durán, Matias Martinez, Patricia Lago et al.

The rapid growth of language models, particularly in code generation, requires substantial computational resources, raising concerns about energy consumption and environmental impact. Optimizing language models inference resource utilization is crucial, and Small Language Models (SLMs) offer a promising solution to reduce resource demands. Our goal is to analyze the impact of deep learning serving configurations, defined as combinations of runtime engines and execution providers, on resource utilization, in terms of energy consumption, execution time, and computing-resource utilization from the point of view of software engineers conducting inference in the context of code generation SLMs. We conducted a technology-oriented, multi-stage experimental pipeline using twelve code generation SLMs to investigate energy consumption, execution time, and computing-resource utilization across the configurations. Significant differences emerged across configurations. CUDA execution provider configurations outperformed CPU execution provider configurations in both energy consumption and execution time. Among the configurations, TORCH paired with CUDA demonstrated the greatest energy efficiency, achieving energy savings from 37.99% up to 89.16% compared to other serving configurations. Similarly, optimized runtime engines like ONNX with the CPU execution provider achieved from 8.98% up to 72.04% energy savings within CPU-based configurations. Also, TORCH paired with CUDA exhibited efficient computing-resource utilization. Serving configuration choice significantly impacts resource utilization. While further research is needed, we recommend the above configurations best suited to software engineers' requirements for enhancing serving resource utilization efficiency.

SEJan 17, 2019
Multilevel Coupled Model Transformations for Precise and Reusable Definition of Model Behaviour

Fernando Macías, Uwe Wolter, Adrian Rutle et al.

The use of Domain-Specific Languages (DSLs) is a promising field for the development of tools tailored to specific problem spaces, effectively diminishing the complexity of hand-made software. With the goal of making models as precise, simple and reusable as possible, we augment DSLs with concepts from multilevel modelling, where the number of abstraction levels are not limited. This is particularly useful for DSL definitions with behaviour, whose concepts inherently belong to different levels of abstraction. Here, models can represent the state of the modelled system and evolve using model transformations. These transformations can benefit from a multilevel setting, becoming a precise and reusable definition of the semantics for behavioural modelling languages. We present in this paper the concept of Multilevel Coupled Model Transformations, together with examples, formal definitions and tools to assess their conceptual soundness and practical value.