NISep 18, 2023
Double Deep Q-Learning-based Path Selection and Service Placement for Latency-Sensitive Beyond 5G ApplicationsMasoud Shokrnezhad, Tarik Taleb, Patrizio Dazzi
Nowadays, as the need for capacity continues to grow, entirely novel services are emerging. A solid cloud-network integrated infrastructure is necessary to supply these services in a real-time responsive, and scalable way. Due to their diverse characteristics and limited capacity, communication and computing resources must be collaboratively managed to unleash their full potential. Although several innovative methods have been proposed to orchestrate the resources, most ignored network resources or relaxed the network as a simple graph, focusing only on cloud resources. This paper fills the gap by studying the joint problem of communication and computing resource allocation, dubbed CCRA, including function placement and assignment, traffic prioritization, and path selection considering capacity constraints and quality requirements, to minimize total cost. We formulate the problem as a non-linear programming model and propose two approaches, dubbed B\&B-CCRA and WF-CCRA, based on the Branch \& Bound and Water-Filling algorithms to solve it when the system is fully known. Then, for partially known systems, a Double Deep Q-Learning (DDQL) architecture is designed. Numerical simulations show that B\&B-CCRA optimally solves the problem, whereas WF-CCRA delivers near-optimal solutions in a substantially shorter time. Furthermore, it is demonstrated that DDQL-CCRA obtains near-optimal solutions in the absence of request-specific information.
NIMar 27, 2025
Declarative Traffic Engineering for Low-Latency and Reliable NetworkingJacopo Massa, Stefano Forti, Federica Paganelli et al.
Cloud-Edge applications like industrial control systems and connected vehicles demand stringent end-to-end latency guarantees. Among existing data plane candidate solutions for bounded latency networking, the guaranteed Latency-Based Forwarding (gLBF) approach ensures punctual delivery of traffic flows by managing per-hop delays to meet specific latency targets, while not requiring that per-flow states are maintained at each hop. However, as a forwarding plane mechanism, gLBF does not define the control mechanisms for determining feasible forwarding paths and per-hop latency budgets for packets to fulfil end-to-end latency objectives. In this work, we propose such a control mechanism implemented in Prolog that complies with gLBF specifications, called declarative gLBF (dgLBF). The declarative nature of Prolog allows our prototype to be concise (~120 lines of code) and easy to extend. We show how the core dgLBF implementation is extended to add reliability mechanisms, path protection, and fate-sharing avoidance to enhance fault tolerance and robustness. Finally, we evaluate the system's performance through simulative experiments under different network topologies and with increasing traffic load to simulate saturated network conditions, scaling up to 6000 flows. Our results show a quasi-linear degradation in placement times and system resilience under heavy traffic.
DCMay 12
Trade-offs in Decentralized Agentic AI Discovery Across the Compute ContinuumPatrizio Dazzi, Emanuele Carlini, Matteo Mordacchini et al.
Agentic systems deployed across the compute continuum need discovery mechanisms that remain effective across cloud, edge, and intermittently connected domains. In some emerging agentic architectures, decentralized discovery is already an active design direction, placing DHT-based lookup on the path toward agent directories. This paper studies the trade-offs among major structured-overlay families for agent discovery, comparing Chord, Pastry, and Kademlia as candidate indexing substrates within a shared control-plane framework. Using a benchmark subset centered on a 4096-node stationary comparison and a representative 4096-node churn benchmark, the paper characterizes how discovery reliability, startup behavior, and control-plane overhead vary across these overlays. The goal is to clarify the operating points they expose for agent discovery across edge-to-cloud environments.
DCJan 14
High-Performance Serverless Computing: A Systematic Literature Review on Serverless for HPC, AI, and Big DataValerio Besozzi, Matteo Della Bartola, Patrizio Dazzi et al.
The widespread deployment of large-scale, compute-intensive applications such as high-performance computing, artificial intelligence, and big data is leading to convergence between cloud and high-performance computing infrastructures. Cloud providers are increasingly integrating high-performance computing capabilities in their infrastructures, such as hardware accelerators and high-speed interconnects, while researchers in the high-performance computing community are starting to explore cloud-native paradigms to improve scalability, elasticity, and resource utilization. In this context, serverless computing emerges as a promising execution model to efficiently handle highly dynamic, parallel, and distributed workloads. This paper presents a comprehensive systematic literature review of 122 research articles published between 2018 and early 2025, exploring the use of the serverless paradigm to develop, deploy, and orchestrate compute-intensive applications across cloud, high-performance computing, and hybrid environments. From these, a taxonomy comprising eight primary research directions and nine targeted use case domains is proposed, alongside an analysis of recent publication trends and collaboration networks among authors, highlighting the growing interest and interconnections within this emerging research field. Overall, this work aims to offer a valuable foundation for both new researchers and experienced practitioners, guiding the development of next-generation serverless solutions for parallel compute-intensive applications.
MAApr 25
Usable Agent Discovery for Decentralized AI SystemsPatrizio Dazzi, Emanuele Carlini, Matteo Mordacchini et al.
Large-scale agentic systems run on distributed infrastructures where many software agents share physical hosts and are discovered via peer-to-peer mechanisms. Discovery must handle node-level churn from failures and host departures and agent-level churn from demand-driven activation, deactivation, and state changes. Their interaction reshapes classic trade-offs between structured and unstructured overlays. We study decentralized agent discovery under this two-level churn, assuming nodes host multiple agents, overlays are structured or gossip-based, and agents switch between warm and cold states. Using Kademlia as a structured and Cyclon+Vicinity as a gossip baseline, we compare stable, node-churn-only, agent-cooling-only, and combined regimes to see when routing efficiency, resilience, and service readiness align or favor different designs. Structured overlays are more robust and efficient in stable and node-churn regimes, while gossip-based overlays remain competitive and can be faster when readiness dominates.
DCApr 23, 2024
Graph Neural Networks and Reinforcement Learning for Proactive Application Image PlacementAntonios Makris, Theodoros Theodoropoulos, Evangelos Psomakelis et al.
The shift from Cloud Computing to a Cloud-Edge continuum presents new opportunities and challenges for data-intensive and interactive applications. Edge computing has garnered a lot of attention from both industry and academia in recent years, emerging as a key enabler for meeting the increasingly strict demands of Next Generation applications. In Edge computing the computations are placed closer to the end-users, to facilitate low-latency and high-bandwidth applications and services. However, the distributed, dynamic, and heterogeneous nature of Edge computing, presents a significant challenge for service placement. A critical aspect of Edge computing involves managing the placement of applications within the network system to minimize each application's runtime, considering the resources available on system devices and the capabilities of the system's network. The placement of application images must be proactively planned to minimize image tranfer time, and meet the strict demands of the applications. In this regard, this paper proposes an approach for proactive image placement that combines Graph Neural Networks and actor-critic Reinforcement Learning, which is evaluated empirically and compared against various solutions. The findings indicate that although the proposed approach may result in longer execution times in certain scenarios, it consistently achieves superior outcomes in terms of application placement.
AIJul 14, 2021
TEACHING -- Trustworthy autonomous cyber-physical applications through human-centred intelligenceDavide Bacciu, Siranush Akarmazyan, Eric Armengaud et al.
This paper discusses the perspective of the H2020 TEACHING project on the next generation of autonomous applications running in a distributed and highly heterogeneous environment comprising both virtual and physical resources spanning the edge-cloud continuum. TEACHING puts forward a human-centred vision leveraging the physiological, emotional, and cognitive state of the users as a driver for the adaptation and optimization of the autonomous applications. It does so by building a distributed, embedded and federated learning system complemented by methods and tools to enforce its dependability, security and privacy preservation. The paper discusses the main concepts of the TEACHING approach and singles out the main AI-related research challenges associated with it. Further, we provide a discussion of the design choices for the TEACHING system to tackle the aforementioned challenges
DCMar 11, 2015
Tools and Models for High Level Parallel and Grid ProgrammingPatrizio Dazzi
When algorithmic skeletons were first introduced by Cole in late 1980 the idea had an almost immediate success. The skeletal approach has been proved to be effective when application algorithms can be expressed in terms of skeletons composition. However, despite both their effectiveness and the progress made in skeletal systems design and implementation, algorithmic skeletons remain absent from mainstream practice. Cole and other researchers, focused the problem. They recognized the issues affecting skeletal systems and stated a set of principles that have to be tackled in order to make them more effective and to take skeletal programming into the parallel mainstream. In this thesis we propose tools and models for addressing some among the skeletal programming environments issues. We describe three novel approaches aimed at enhancing skeletons based systems from different angles. First, we present a model we conceived that allows algorithmic skeletons customization exploiting the macro data-flow abstraction. Then we present two results about the exploitation of meta-programming techniques for the run-time generation and optimization of macro data-flow graphs. In particular, we show how to generate and how to optimize macro data-flow graphs accordingly both to programmers provided non-functional requirements and to execution platform features. The last result we present are the Behavioural Skeletons, an approach aimed at addressing the limitations of skeletal programming environments when used for the development of component-based Grid applications. We validated all the approaches conducting several test, performed exploiting a set of tools we developed.