DB | AI | IR | Sep 28, 2025

ML-Asset Management: Curation, Discovery, and Utilization

arXiv:2509.23577v1 | 4 citations | h-index: 12 | Proc VLDB Endow
Originality: Synthesis-oriented
AI Analysis

This is an incremental tutorial for researchers and practitioners facing challenges in managing ML assets like models and datasets.

This tutorial addresses the problem of underutilized machine learning assets due to fragmented documentation and siloed storage by providing a comprehensive overview of ML-asset management activities, including curation, discovery, and utilization, with live demonstrations of systems.

Machine learning (ML) assets, such as models, datasets, and metadata, are central to modern ML workflows. Despite their explosive growth in practice, these assets are often underutilized due to fragmented documentation, siloed storage, inconsistent licensing, and a lack of unified discovery mechanisms, making ML-asset management an urgent challenge. This tutorial offers a comprehensive overview of ML-asset management activities across the asset lifecycle, including curation, discovery, and utilization. We provide a categorization of ML assets and major management issues, survey state-of-the-art techniques, and identify emerging opportunities at each stage. We further highlight system-level challenges related to scalability, lineage, and unified indexing. Through live demonstrations of systems, this tutorial equips both researchers and practitioners with actionable insights and practical tools for advancing ML-asset management in real-world and domain-specific settings.

