nuts-flow/ml: data pre-processing for deep learning
This framework targets inefficient data preprocessing, a common bottleneck for deep learning practitioners. Its contribution is incremental, since it builds on existing pipeline concepts.
The authors address the time cost of data preprocessing in deep learning with nuts-flow/ml, a software framework that encapsulates common operations as flexible components, enabling rapid construction of efficient preprocessing pipelines.
Data preprocessing is a fundamental part of any machine learning application and frequently the most time-consuming aspect when developing a machine learning solution. Preprocessing for deep learning is characterized by pipelines that lazily load data and perform data transformation, augmentation, batching and logging. Many of these functions are common across applications but require different arrangements for training, testing or inference. Here we introduce a novel software framework named nuts-flow/ml that encapsulates common preprocessing operations as components, which can be flexibly arranged to rapidly construct efficient preprocessing pipelines for deep learning.
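To make the idea of rearrangeable, lazily evaluated components concrete, the following is a minimal sketch in plain Python generators. It is not the nuts-flow/ml API: every name here (`load_samples`, `transform`, `augment`, `batch`, `build_pipeline`) is hypothetical, the data is synthetic, and logging is omitted for brevity. It only illustrates how the same components might be arranged differently for training and for testing.

```python
import random
from itertools import islice


def load_samples(n):
    """Lazily yield (features, label) pairs; stands in for reading files from disk."""
    for i in range(n):
        yield [float(i), float(i + 1)], i % 2


def transform(samples, scale=0.5):
    """Rescale every feature value (hypothetical transformation step)."""
    for x, y in samples:
        yield [v * scale for v in x], y


def augment(samples, rng, noise=0.1):
    """Add small uniform noise to the features (hypothetical augmentation step)."""
    for x, y in samples:
        yield [v + rng.uniform(-noise, noise) for v in x], y


def batch(samples, size):
    """Group consecutive samples into (features, labels) batches of at most `size`."""
    it = iter(samples)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield [x for x, _ in chunk], [y for _, y in chunk]


def build_pipeline(n, batch_size, training, seed=0):
    """Arrange the same components differently for training and testing."""
    samples = transform(load_samples(n))
    if training:
        # Augmentation is applied only in the training arrangement.
        samples = augment(samples, random.Random(seed))
    return batch(samples, batch_size)


# Nothing is loaded or computed until the generators are consumed here.
train_batches = list(build_pipeline(5, 2, training=True))
test_batches = list(build_pipeline(5, 2, training=False))
```

Because each stage is a generator, samples flow through the pipeline one at a time and only one batch is materialized at once. Swapping, removing, or reordering stages changes the arrangement without touching the components themselves.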