DC AI DATA-AN · Feb 28, 2025

Supporting the development of Machine Learning for fundamental science in a federated Cloud with the AI_INFN platform

Lucio Anderlini, Matteo Barbetti, Giulio Bianchini, Diego Ciangottini, Stefano Dal Pra, Diego Michelotto, Carmelo Pellegrino, Rosa Petrini, Alessandro Pascolini, Daniele Spiga

arXiv:2502.21266v2 · 1 citation · h-index: 100 · EPJ Web Conf

Originality: Synthesis-oriented

AI Analysis

This work supports INFN researchers in adopting ML for fundamental science by providing tailored computing infrastructure. Its contribution is incremental, since it builds on existing cloud-native solutions.

The paper addresses the challenge of provisioning and orchestrating hardware accelerators for machine learning in scientific computing by developing a Kubernetes platform to facilitate GPU-powered data analysis workflows on federated cloud resources.

Machine Learning (ML) is driving a revolution in the way scientists design, develop, and deploy data-intensive software. However, the adoption of ML presents new challenges for the computing infrastructure, particularly in provisioning and orchestrating access to hardware accelerators for development, testing, and production. The INFN-funded project AI_INFN ("Artificial Intelligence at INFN") aims to foster the adoption of ML techniques within INFN use cases by providing support on multiple aspects, including the provision of AI-tailored computing resources. It leverages cloud-native solutions in the context of INFN Cloud to share hardware accelerators as effectively as possible while ensuring that the diversity of the Institute's research activities is not compromised. In this contribution, we provide an update on the commissioning of a Kubernetes platform designed to ease the development of GPU-powered data analysis workflows and their scalability on heterogeneous, distributed computing resources, possibly federated as Virtual Kubelets with the interLink provider.
