DB · LG · PL · May 15, 2023

Transactional Python for Durable Machine Learning: Vision, Challenges, and Feasibility

arXiv:2305.08770v1
Originality: Incremental advance
AI Analysis

The paper addresses reliability problems faced by ML practitioners who work in Python, offering a non-intrusive way to avoid wasted time and resources. The advance is incremental because it builds on existing systems.

The paper tackles data loss in Python-based machine learning caused by machine failures or human errors. It proposes Transactional Python, which provides durability, atomicity, replicability, and time-versioning (DART) without code modifications, and reports overheads of 1.5% to 15.6% in a proof-of-concept implementation.

In machine learning (ML), Python serves as a convenient abstraction for working with key libraries such as PyTorch and scikit-learn. Unlike a DBMS, however, a Python application may lose important data, such as trained models and extracted features, due to machine failures or human errors, wasting time and resources. Specifically, Python applications lack four essential properties that could make ML more reliable and user-friendly: durability, atomicity, replicability, and time-versioning (DART). This paper presents our vision of Transactional Python, which provides DART without any code modifications to user programs or the Python kernel, by non-intrusively monitoring application state at the object level and determining the minimal amount of information sufficient to reconstruct the whole application. Our evaluation of a proof-of-concept implementation with public PyTorch and scikit-learn applications shows that DART can be offered with overheads ranging from 1.5% to 15.6%.
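To make the idea of object-level monitoring concrete, here is a minimal sketch of how one might persist only the variables that changed since the last save. It is not the authors' implementation: the class `CheckpointStore` and its methods `commit` and `restore` are hypothetical names, change detection by hashing pickled bytes is an assumed simplification, and unlike the system described in the abstract, this sketch requires the user to call `commit` explicitly.

```python
import hashlib
import os
import pickle
import tempfile


class CheckpointStore:
    """Hypothetical helper: persists only variables that changed since the last commit."""

    def __init__(self, directory):
        self.directory = directory
        self._digests = {}  # variable name -> SHA-256 of its last persisted pickle
        os.makedirs(directory, exist_ok=True)

    def commit(self, namespace):
        """Persist changed variables and return the names that were written."""
        written = []
        for name, value in namespace.items():
            payload = pickle.dumps(value)
            digest = hashlib.sha256(payload).hexdigest()
            if self._digests.get(name) == digest:
                continue  # unchanged since the last commit, so skip the write
            self._write_atomically(name, payload)
            self._digests[name] = digest
            written.append(name)
        return written

    def _write_atomically(self, name, payload):
        # Write to a temporary file and rename it into place, so a crash
        # mid-write never leaves a half-written checkpoint under the final name.
        fd, tmp_path = tempfile.mkstemp(dir=self.directory)
        with os.fdopen(fd, "wb") as handle:
            handle.write(payload)
            handle.flush()
            os.fsync(handle.fileno())
        os.replace(tmp_path, os.path.join(self.directory, name + ".pkl"))

    def restore(self):
        """Rebuild the namespace from the persisted files."""
        namespace = {}
        for filename in os.listdir(self.directory):
            if filename.endswith(".pkl"):
                path = os.path.join(self.directory, filename)
                with open(path, "rb") as handle:
                    namespace[filename[:-4]] = pickle.load(handle)
        return namespace


# Usage: the second commit writes only the variable that changed.
store = CheckpointStore(tempfile.mkdtemp())
state = {"weights": [0.1, 0.2], "epoch": 1}
first = store.commit(state)   # both variables are new, so both are written
state["epoch"] = 2
second = store.commit(state)  # only "epoch" differs from the last commit
restored = store.restore()
```

The sketch covers durability (state survives on disk) and a narrow form of atomicity (each file is replaced in one rename). It does not address replicability or time-versioning, and it ignores objects that cannot be pickled or whose pickled bytes are not stable across calls.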

