DC · May 5
ClusterLess: Deadline-Aware Serverless Workflow Orchestration on Federated Edge Clusters
Reza Farahani, Mario Colosi, Ilir Murturi et al.
The recent convergence of edge computing, serverless execution, and Kubernetes (K8s)-based container orchestration has enabled the processing of application workflows close to data sources. While effective within a single edge cluster, existing schemes do not generalize to federated multi-edge environments, where multiple workflows execute concurrently under strict end-to-end (E2E) deadline constraints. This paper introduces ClusterLess, a deadline-aware serverless workflow orchestration method for federated multi-edge K8s clusters. ClusterLess manages the E2E lifecycle of workflow execution, including dependency analysis, execution mode selection, and resource-aware placement. To this end, it integrates structured intra-cluster orchestration with a leader-selected, super-master-driven inter-cluster coordination layer, determining where and how each workflow function should be executed across the federated edge clusters. We implement ClusterLess using OpenFaaS as the serverless execution substrate and Argo for workflow management, and deploy it on a realistic testbed of six edge clusters comprising 64 heterogeneous edge nodes. Experimental results with concurrent serverless workflows, spanning 18 workload configurations across different input sizes and deadline classes, show that ClusterLess reduces workflow completion time by up to 40%, increases deadline satisfaction from below 50% to over 90%, and confines deadline violations to single-digit seconds compared to four baseline methods.
DC · May 19 · Code
FedADAS: Communication-Efficient Federated Distillation for On-Device Driver Yawn Recognition in Vehicular Networks
Ahmed Mujtaba, Gleb Radchenko, Marc Masana et al.
Driver fatigue is a critical safety concern in advanced driver assistance systems. Driver monitoring models trained off-site on static datasets adapt poorly to real-world conditions, while standard federated learning imposes high communication overhead, assumes homogeneous architectures, and struggles with personalized driver data. We present FedADAS, a federated distillation framework enabling collaborative on-device learning across heterogeneous vehicular networks. FedADAS enables full model heterogeneity by exchanging only soft logits on a shared public dataset, allowing each vehicle to run a customized model tailored to its computational constraints. Additionally, we introduce a yawn recognition pipeline supporting training and inference on edge devices that provides two robust architectures: a Performance-Efficient model (99.7 MB) achieving a 98.3% F1-score with 1.99 ms inference time on a Jetson Nano, and a Memory-Efficient model (0.6 MB) that trains an epoch in 6.12 minutes on a Jetson AGX Orin. In experiments with up to 115 edge clients, FedADAS significantly outperforms traditional federated learning approaches at higher client participation, achieving up to a 9974x reduction in communication cost while maintaining a superior tradeoff between personalization and generalization under extreme data heterogeneity, demonstrating its suitability for real-world deployment. Code is available at https://opensource.silicon-austria.com/mujtabaa/fedadas
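The logit-exchange idea in this abstract can be illustrated with a minimal sketch. It is not the FedADAS implementation: the function names, the temperature parameter, and the plain averaging rule are all assumptions chosen for illustration. Each client reports one logit vector per sample of the shared public dataset, and the aggregate is the mean of the clients' softmax outputs, so the payload depends on the number of public samples and classes rather than on model size.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax; temperature is a hypothetical smoothing knob."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def aggregate_soft_labels(client_logits, temperature=1.0):
    """Average per-sample class probabilities across clients.

    client_logits[c][i] is client c's logit vector for public sample i.
    Plain averaging is an assumption, not the rule used by FedADAS.
    """
    n_clients = len(client_logits)
    n_samples = len(client_logits[0])
    aggregated = []
    for i in range(n_samples):
        probs = [softmax(client_logits[c][i], temperature) for c in range(n_clients)]
        n_classes = len(probs[0])
        aggregated.append(
            [sum(p[k] for p in probs) / n_clients for k in range(n_classes)]
        )
    return aggregated


# Two clients with different (unseen) architectures, one public sample, two classes.
soft_targets = aggregate_soft_labels([[[2.0, 0.0]], [[0.0, 2.0]]])
```

Each client could then distill from `soft_targets` on the public data, regardless of its own model architecture.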
IV · Mar 13
DQ-Ladder: A Deep Reinforcement Learning-based Bitrate Ladder for Adaptive Video Streaming
Reza Farahani, Zoha Azimi, Vignesh V Menon et al.
Adaptive streaming of segmented video over HTTP typically relies on a predefined set of bitrate-resolution pairs, known as a bitrate ladder. However, fixed ladders often overlook variations in content and decoding complexities, leading to suboptimal trade-offs between encoding time, decoding efficiency, and video quality. This article introduces DQ-Ladder, a deep reinforcement learning (DRL)-based scheme for constructing time- and quality-aware bitrate ladders for adaptive video streaming applications. DQ-Ladder employs predicted decoding time, quality scores, and bitrate levels per segment as inputs to a Deep Q-Network (DQN) agent, guided by a weighted reward function of decoding time, video quality, and resolution smoothness. We leverage machine learning models to predict decoding time, bitrate level, and objective quality metrics (VMAF, XPSNR), eliminating the need for exhaustive encoding or quality metric computation. We evaluate DQ-Ladder using the Versatile Video Coding (VVC) toolchain (VVenC/VVdeC) on 750 video sequences across six Apple HLS-compliant resolutions and 41 quantization parameters. Experimental results against four baselines show that DQ-Ladder achieves BD-rate reductions of at least 10.3% for XPSNR compared to the HLS ladder, while reducing decoding time by 22%. DQ-Ladder shows significantly lower sensitivity to prediction errors than competing methods, remaining robust even with up to 20% noise.
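As a rough illustration of a weighted reward over decoding time, video quality, and resolution smoothness, consider the sketch below. The linear form, the weight names, the default weights, and the assumption that quality and decoding time are normalized to [0, 1] are all hypothetical; the abstract does not give the actual reward used by DQ-Ladder.

```python
def ladder_reward(quality, decode_time, resolution_idx, prev_resolution_idx,
                  w_quality=1.0, w_time=1.0, w_smooth=1.0):
    """Hypothetical per-segment reward for a bitrate-ladder agent.

    quality and decode_time are assumed to be normalized to [0, 1].
    resolution_idx values are positions in the resolution list; a jump
    between neighbouring ladder rungs is penalized to favour smoothness.
    """
    smooth_penalty = abs(resolution_idx - prev_resolution_idx)
    return (w_quality * quality
            - w_time * decode_time
            - w_smooth * smooth_penalty)


# Higher quality raises the reward; slower decoding and resolution jumps lower it.
reward = ladder_reward(0.9, 0.2, 3, 2, w_quality=1.0, w_time=0.5, w_smooth=0.1)
```

Tuning the three weights shifts the trade-off: a larger `w_time` favours rungs that decode quickly, while a larger `w_smooth` discourages abrupt resolution changes between adjacent rungs.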
DC · Oct 29, 2023
Comparison of Microservice Call Rate Predictions for Replication in the Cloud
Narges Mehran, Arman Haghighi, Pedram Aminharati et al.
Today, many users deploy their microservice-based applications with various interconnections on a cluster of Cloud machines, subject to stochastic changes due to dynamic user requirements. To address this problem, we compare three machine learning (ML) models that predict microservice call rates from microservice times, with the aim of estimating scalability requirements. We apply the linear regression (LR), multilayer perceptron (MLP), and gradient boosting regression (GBR) models to the Alibaba microservice traces. The prediction results reveal that the LR model trains faster than the GBR and MLP models. However, the GBR model reduces the mean absolute error and the mean absolute percentage error compared to the LR and MLP models. Moreover, the number of replicas that the gradient boosting model predicts for each microservice is close to the number derived directly from the actual test data.
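The two error metrics named in this abstract, and the step from a predicted call rate to a replica count, can be sketched as follows. The metric definitions are standard; the replica rule (ceiling of call rate over a per-replica capacity) and the name `capacity_per_replica` are assumptions for illustration, since the abstract does not say how replicas are derived.

```python
import math


def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mean_absolute_percentage_error(actual, predicted):
    """Average absolute error relative to the actual value, in percent.

    Undefined when an actual value is zero; callers must filter those out.
    """
    return 100.0 * sum(
        abs(a - p) / abs(a) for a, p in zip(actual, predicted)
    ) / len(actual)


def replicas_needed(call_rate, capacity_per_replica):
    """Hypothetical sizing rule: enough replicas to absorb the call rate."""
    return max(1, math.ceil(call_rate / capacity_per_replica))


actual_rates = [100.0, 200.0]
predicted_rates = [110.0, 180.0]
mae = mean_absolute_error(actual_rates, predicted_rates)
mape = mean_absolute_percentage_error(actual_rates, predicted_rates)
```

Comparing `replicas_needed` on predicted and on actual call rates gives a direct check of whether a prediction error is large enough to change the scaling decision.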
LG · Dec 28, 2025
Osmotic Learning: A Self-Supervised Paradigm for Decentralized Contextual Data Representation
Mario Colosi, Reza Farahani, Maria Fazio et al.
Data within a specific context gains deeper significance beyond its isolated interpretation. In distributed systems, interdependent data sources reveal hidden relationships and latent structures, representing valuable information for many applications. This paper introduces Osmotic Learning (OSM-L), a self-supervised distributed learning paradigm designed to uncover higher-level latent knowledge from distributed data. The core of OSM-L is osmosis, a process that synthesizes dense and compact representations by extracting contextual information, eliminating the need for raw data exchange between distributed entities. OSM-L iteratively aligns local data representations, enabling information diffusion and convergence into a dynamic equilibrium that captures contextual patterns. During training, it also identifies correlated data groups, functioning as a decentralized clustering mechanism. Experimental results confirm OSM-L's convergence and representation capabilities on structured datasets, achieving over 0.99 accuracy in local information alignment while preserving contextual integrity.
DC · Jul 29, 2022
Decentralized Machine Learning for Intelligent Health Care Systems on the Computing Continuum
Dragi Kimovski, Sasko Ristov, Radu Prodan
The introduction of electronic personal health records (EHR) enables nationwide information exchange and curation among different health care systems. However, current EHR systems provide no transparent means for diagnosis support or medical research, nor can they utilize the omnipresent data produced by personal medical devices. Besides, EHR systems are centrally orchestrated, which could potentially lead to a single point of failure. Therefore, in this article, we explore novel approaches for decentralizing machine learning over distributed ledgers to create intelligent EHR systems that can utilize information from personal medical devices for improved knowledge extraction. Consequently, we propose and evaluate a conceptual EHR to enable anonymous predictive analysis across multiple medical institutions. The evaluation results indicate that the decentralized EHR can be deployed over the computing continuum with machine learning time reduced by up to 60% and consensus latency below 8 seconds.
LG · Aug 20, 2025 · Code
Federated Distillation on Edge Devices: Efficient Client-Side Filtering for Non-IID Data
Ahmed Mujtaba, Gleb Radchenko, Radu Prodan et al.
Federated distillation has emerged as a promising collaborative machine learning approach, offering enhanced privacy protection and reduced communication compared to traditional federated learning by exchanging model outputs (soft logits) rather than full model parameters. However, existing methods employ complex selective knowledge-sharing strategies that require clients to identify in-distribution proxy data through computationally expensive statistical density ratio estimators. Additionally, server-side filtering of ambiguous knowledge introduces latency to the process. To address these challenges, we propose a robust, resource-efficient EdgeFD method that reduces the complexity of the client-side density ratio estimation and removes the need for server-side filtering. EdgeFD introduces an efficient KMeans-based density ratio estimator for effectively filtering both in-distribution and out-of-distribution proxy data on clients, significantly improving the quality of knowledge sharing. We evaluate EdgeFD across diverse practical scenarios, including strong non-IID, weak non-IID, and IID data distributions on clients, without requiring a pre-trained teacher model on the server for knowledge distillation. Experimental results demonstrate that EdgeFD outperforms state-of-the-art methods, consistently achieving accuracy levels close to IID scenarios even under heterogeneous and challenging conditions. The significantly reduced computational overhead of the KMeans-based estimator is suitable for deployment on resource-constrained edge devices, thereby enhancing the scalability and real-world applicability of federated distillation. The code is available online for reproducibility.
CV · Dec 12, 2025
YawDD+: Frame-level Annotations for Accurate Yawn Prediction
Ahmed Mujtaba, Gleb Radchenko, Marc Masana et al.
Driver fatigue remains a leading cause of road accidents, with 24% of crashes involving drowsy drivers. While yawning serves as an early behavioral indicator of fatigue, existing machine learning approaches face significant challenges due to video-annotated datasets that introduce systematic noise from coarse temporal annotations. We develop a semi-automated labeling pipeline with human-in-the-loop verification, which we apply to YawDD, enabling more accurate model training. Training the established MNasNet classifier and YOLOv11 detector architectures on YawDD+ improves frame accuracy by up to 6% and mAP by 5% over video-level supervision, achieving 99.34% classification accuracy and 95.69% detection mAP. The resulting approach delivers up to 59.8 FPS on edge AI hardware (NVIDIA Jetson Nano), confirming that enhanced data quality alone supports on-device yawning monitoring without server-side computation.
LG · Oct 1, 2025
Fiaingen: A financial time series generative method matching real-world data quality
Jože M. Rožanec, Tina Žezlin, Laurentiu Vasiliu et al.
Data is vital in enabling machine learning models to advance research and practical applications in finance, where accurate and robust models are essential for investment and trading decision-making. However, real-world data is limited despite its quantity, quality, and variety. The data shortage of various financial assets directly hinders the performance of machine learning models designed to trade and invest in these assets. Generative methods can mitigate this shortage. In this paper, we introduce a set of novel techniques for time series data generation (we name them Fiaingen) and assess their performance across three criteria: (a) overlap of real-world and synthetic data in a reduced-dimensionality space, (b) performance on downstream machine learning tasks, and (c) runtime performance. Our experiments demonstrate that the methods achieve state-of-the-art performance across all three criteria. Synthetic data generated with Fiaingen methods more closely mirrors the original time series data while keeping data generation time close to seconds, ensuring the scalability of the proposed approach. Furthermore, models trained on it achieve performance close to those trained with real-world data.