DCNov 6, 2025
AIvailable: A Software-Defined Architecture for LLM-as-a-Service on Heterogeneous and Legacy GPUsPedro Antunes, Ana Rita Ortigoso, Gabriel Vieira et al.
The rise of Large Language Models (LLM) has increased the need for scalable, high-performance inference systems, yet most existing frameworks assume homogeneous, resource-rich hardware, often unrealistic in academic, or resource-constrained settings. We introduce AIvailable, a low-cost, highly available LLM-as-a-Service (LLMaaS) platform, that uses a software-defined approach for running LLMs across heterogeneous and legacy GPU nodes, including NVIDIA and AMD devices, with a focus on fully utilizing each node's VRAM. AIvailable operates as a fully GPU-accelerated inference without CPU fallbacks, featuring a unified client interface that allows seamless interaction with all deployed LLMs through a single logical unit. The architecture comprises four main components: the Client Interface for user access, the Service Frontend for secure request routing and load balancing, the SDAI Controller for orchestration, deployment, and monitoring, and the Service Backend of heterogeneous GPU nodes executing workloads. By abstracting GPU-specific details and providing dynamic, VRAM-aware allocation and reallocation of models, AIvailable ensures efficient use of resources and resilience against failures or workload fluctuations. Targeting academic labs, private companies, and other constrained organizations, it supports diverse open LLMs helping democratize generative AI through the repurposing of legacy GPUs.
NIAug 31, 2025
Quantum-based QoE Optimization in Advanced Cellular Networks: Integration and Cloud Gaming Use CaseFatma Chaouech, Javier Villegas, António Pereira et al.
This work explores the integration of Quantum Machine Learning (QML) and Quantum-Inspired (QI) techniques for optimizing end-to-end (E2E) network services in telecommunication systems, particularly focusing on 5G networks and beyond. The application of QML and QI algorithms is investigated, comparing their performance with classical Machine Learning (ML) approaches. The present study employs a hybrid framework combining quantum and classical computing leveraging the strengths of QML and QI, without the penalty of quantum hardware availability. This is particularized for the optimization of the Quality of Experience (QoE) over cellular networks. The framework comprises an estimator for obtaining the expected QoE based on user metrics, service settings, and cell configuration, and an optimizer that uses the estimation to choose the best cell and service configuration. Although the approach is applicable to any QoE-based network management, its implementation is particularized for the optimization of network configurations for Cloud Gaming services. Then, it is evaluated via performance metrics such as accuracy and model loading and inference times for the estimator, and time to solution and solution score for the optimizer. The results indicate that QML models achieve similar or superior accuracy to classical ML models for estimation, while decreasing inference and loading times. Furthermore, potential for better performance is observed for higher-dimensional data, highlighting promising results for higher complexity problems. Thus, the results demonstrate the promising potential of QML in advancing network optimization, although challenges related to data availability and integration complexities between quantum and classical ML are identified as future research lines.
CVJul 10, 2025
Lightweight Cloud Masking Models for On-Board Inference in Hyperspectral ImagingMazen Ali, António Pereira, Fabio Gentile et al.
Cloud and cloud shadow masking is a crucial preprocessing step in hyperspectral satellite imaging, enabling the extraction of high-quality, analysis-ready data. This study evaluates various machine learning approaches, including gradient boosting methods such as XGBoost and LightGBM as well as convolutional neural networks (CNNs). All boosting and CNN models achieved accuracies exceeding 93%. Among the investigated models, the CNN with feature reduction emerged as the most efficient, offering a balance of high accuracy, low storage requirements, and rapid inference times on both CPUs and GPUs. Variations of this version, with only up to 597 trainable parameters, demonstrated the best trade-off in terms of deployment feasibility, accuracy, and computational efficiency. These results demonstrate the potential of lightweight artificial intelligence (AI) models for real-time hyperspectral image processing, supporting the development of on-board satellite AI systems for space-based applications.
HCFeb 12, 2022
Complete Inertial Pose Dataset: from raw measurements to pose with low-cost and high-end MARG sensorsManuel Palermo, Sara Cerqueira, João André et al.
The use of wearable technology for posture monitoring has been expanding due to its low-intrusiveness and compliance with daily use requirements. However, there are still open challenges limiting its widespread use, especially when dealing with low-cost systems. Most solutions falls either into fully functioning commercial products with high costs, or ad-hoc solutions with lower performance. Moreover, there are few datasets available, from which complete and general solutions can be derived. This work presents 2 datasets, containing low-cost and high-end Magnetic, Angular Rate, and Gravity (MARG) sensor data respectively. It provides data for the analysis of the complete inertial pose pipeline, from raw measurements, to sensor-to-segment calibration, multi-sensor fusion, skeleton kinematics, to the complete human pose. Multiple trials were collected with 21 and 10 subjects respectively, performing 6 types of sequences (ranging from calibration, to daily-activities and random movements). It presents a high degree of variability and complex dynamics with almost complete range-of-motion, while containing common sources of error found on real conditions. This amounts to 3.5M samples, synchronized with a ground-truth inertial motion capture system at 60hz. A simple end-to-end inertial pose method was briefly described and used to validate the quality of the data in both acquisitions. This database may contribute to assess, benchmark and develop novel algorithms for each of the pipelines' processing steps, with applications in classic or data-driven inertial pose estimation algorithms, human movement understanding and forecasting and ergonomic assessment in industrial or rehabilitation settings. All the data is freely available on an online database and accompanied with code to process and analyze the complete data pipeline.