SE, AI, LG · Mar 25, 2021

Characterizing and Detecting Mismatch in Machine-Learning-Enabled Systems

arXiv:2103.14101v1 · 44 citations
AI Analysis

This paper addresses the problem of seamless deployment and operation of ML-enabled systems for developers and practitioners, but its contribution is incremental because it builds on existing concerns about role integration.

The paper tackles the challenge of misalignment among data science, software engineering, and operations roles in ML-enabled systems, which leads to mismatches and system failures, by identifying and validating common mismatch types through interviews and surveys.

Increasing availability of machine learning (ML) frameworks and tools, as well as their promise to improve solutions to data-driven decision problems, has resulted in popularity of using ML techniques in software systems. However, end-to-end development of ML-enabled systems, as well as their seamless deployment and operations, remain a challenge. One reason is that development and deployment of ML-enabled systems involves three distinct workflows, perspectives, and roles, which include data science, software engineering, and operations. These three distinct perspectives, when misaligned due to incorrect assumptions, cause ML mismatches which can result in failed systems. We conducted an interview and survey study where we collected and validated common types of mismatches that occur in end-to-end development of ML-enabled systems. Our analysis shows that how each role prioritizes the importance of relevant mismatches varies, potentially contributing to these mismatched assumptions. In addition, the mismatch categories we identified can be specified as machine readable descriptors contributing to improved ML-enabled system development. In this paper, we report our findings and their implications for improving end-to-end ML-enabled system development.
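The abstract notes that mismatch categories can be specified as machine-readable descriptors. As a rough illustration of that idea only, the sketch below assumes two hypothetical descriptors, one published with a trained model and one describing the production environment, and compares them to flag mismatched assumptions. The class names, field names, and checks are invented for this example and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class TrainedModelDescriptor:
    """Hypothetical descriptor a data scientist might publish with a model."""
    input_features: Tuple[str, ...]   # features the model expects at inference
    max_inference_ms: float           # measured worst-case inference latency
    required_memory_mb: int           # memory the model needs when loaded


@dataclass(frozen=True)
class ProductionEnvDescriptor:
    """Hypothetical descriptor an operations team might publish."""
    available_features: Tuple[str, ...]  # features the serving pipeline provides
    latency_budget_ms: float             # latency allowed per request
    memory_limit_mb: int                 # memory available to the model


def detect_mismatches(model: TrainedModelDescriptor,
                      env: ProductionEnvDescriptor) -> List[str]:
    """Return one human-readable message per mismatched assumption."""
    mismatches = []
    missing = [f for f in model.input_features
               if f not in env.available_features]
    if missing:
        mismatches.append("missing input features: " + ", ".join(missing))
    if model.max_inference_ms > env.latency_budget_ms:
        mismatches.append("inference latency exceeds the latency budget")
    if model.required_memory_mb > env.memory_limit_mb:
        mismatches.append("model memory exceeds the environment limit")
    return mismatches
```

Running a check like `detect_mismatches(model, env)` before deployment would let each role state its assumptions explicitly and have disagreements reported automatically, rather than discovered as failures in production.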

