How Developers Iterate on Machine Learning Workflows -- A Survey of the Applied Machine Learning Literature
This work addresses the need for quantitative evidence on iterative ML development, which can aid in creating human-in-the-loop systems. Its contribution is incremental: it provides initial insights rather than a comprehensive analysis.
The authors conducted a small-scale survey of applied machine learning literature to quantitatively characterize iteration in ML workflow development, reporting preliminary trends and insights as a starting point for a benchmark.
Machine learning workflow development is anecdotally regarded as an iterative process of trial and error with humans in the loop. However, we are not aware of quantitative evidence corroborating this popular belief. A quantitative characterization of iteration can serve as a benchmark for machine learning workflow development in practice, and can aid the development of human-in-the-loop machine learning systems. To this end, we conduct a small-scale survey of the applied machine learning literature from five distinct application domains. We collect and distill statistics on the role of iteration within machine learning workflow development, and report preliminary trends and insights from our investigation as a starting point towards this benchmark. Finally, based on our findings, we describe desiderata for effective and versatile human-in-the-loop machine learning systems that can cater to users in diverse domains.