Model Lake: a New Alternative for Machine Learning Models Management and Governance
This work addresses the need for standardized ML model management across industries, though the contribution appears incremental because it builds on existing data lake concepts.
The paper tackles the problem of managing and governing machine learning models by proposing Model Lake, a centralized framework inspired by data lakes, which enhances model lifecycle management, discovery, audit, and reusability in organizational environments.
The rise of artificial intelligence and data science across industries underscores the pressing need for effective management and governance of machine learning (ML) models. Traditional approaches to ML model management often involve disparate storage systems and lack standardized methodologies for versioning, audit, and reuse. Inspired by data lake concepts, this paper develops the concept of an ML Model Lake as a centralized management framework for datasets, code, and models within organizational environments. We provide an in-depth exploration of the Model Lake concept, delineating its architectural foundations, key components, operational benefits, and practical challenges. We discuss the transformative potential of adopting a Model Lake approach, such as enhanced model lifecycle management, discovery, audit, and reusability. Furthermore, we illustrate a real-world application of Model Lake and its transformative impact on data, code, and model management practices.
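To make the idea of a single catalog for datasets, code, and models more concrete, the following is a minimal in-memory sketch. It is not the architecture described in the paper: the class names (`ModelLake`, `ModelEntry`), the fields (`dataset_id`, `code_ref`, `tags`), and the methods are all hypothetical, chosen only to illustrate versioning, discovery, and audit in one place.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Dict, Iterable, List, Optional, Tuple


@dataclass(frozen=True)
class ModelEntry:
    """One immutable model version, linked to its dataset and code (hypothetical schema)."""
    name: str
    version: int
    dataset_id: str   # assumed identifier of the training dataset
    code_ref: str     # assumed reference to the training code, e.g. a commit hash
    tags: Tuple[str, ...] = ()


@dataclass
class ModelLake:
    """Toy central catalog: versioned entries plus an append-only audit log."""
    _entries: Dict[Tuple[str, int], ModelEntry] = field(default_factory=dict)
    _audit_log: List[Tuple[str, str, str]] = field(default_factory=list)

    def _record(self, action: str, detail: str) -> None:
        # Every read and write is timestamped so that access can be audited later.
        stamp = datetime.now(timezone.utc).isoformat()
        self._audit_log.append((stamp, action, detail))

    def register(self, name: str, dataset_id: str, code_ref: str,
                 tags: Iterable[str] = ()) -> ModelEntry:
        # The next version is one more than the highest existing version of this name.
        latest = max((v for (n, v) in self._entries if n == name), default=0)
        entry = ModelEntry(name, latest + 1, dataset_id, code_ref, tuple(tags))
        self._entries[(name, entry.version)] = entry
        self._record("register", f"{name}:v{entry.version}")
        return entry

    def get(self, name: str, version: Optional[int] = None) -> ModelEntry:
        # Without an explicit version, return the most recent one.
        versions = [v for (n, v) in self._entries if n == name]
        if not versions:
            raise KeyError(name)
        chosen = max(versions) if version is None else version
        entry = self._entries[(name, chosen)]  # raises KeyError for an unknown version
        self._record("get", f"{name}:v{chosen}")
        return entry

    def find_by_tag(self, tag: str) -> List[ModelEntry]:
        # Discovery: list every model version carrying the given tag.
        return [e for e in self._entries.values() if tag in e.tags]

    def audit_log(self) -> List[Tuple[str, str, str]]:
        # Return a copy so callers cannot alter the log.
        return list(self._audit_log)
```

In this sketch, registering the same model name twice yields versions 1 and 2, `get` without a version returns the latest, and each call leaves a timestamped record in the audit log. A real deployment would need persistent storage, access control, and artifact storage, none of which are modeled here.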