From Limited Annotated Raw Material Data to Quality Production Data: A Case Study in the Milk Industry (Technical Report)
This work addresses data acquisition challenges in industrial settings such as dairy production, but its contribution is incremental, as it extends existing active learning methods.
The paper tackles the problem of building production outcome models from limited annotated raw material data by proposing an active learning methodology for regression, demonstrated on a milk-industry application in which milk is processed into cottage cheese.
Industry 4.0 offers opportunities to combine multiple sensor data sources using IoT technologies for better utilization of raw material in production lines. The common belief that data is readily available (the big data phenomenon) is often challenged by the need to acquire quality data effectively under severe constraints. In this paper we propose a design methodology that uses active learning to enhance learning capabilities when building a model of production outcome from a constrained amount of raw material training data. The proposed methodology extends existing active learning methods to solve regression-based learning problems effectively, and may serve settings where data acquisition requires excessive resources in the physical world. We further suggest a set of qualitative measures for analyzing learners' performance. The proposed methodology is demonstrated using an actual application in the milk industry, where milk is gathered from multiple small milk farms and brought to a dairy production plant to be processed into cottage cheese.
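To make the idea of active learning for regression concrete, the following is a minimal sketch of a generic pool-based loop. It is not the methodology proposed in the paper, whose details are not given here. It assumes a query-by-committee strategy in which a committee of bootstrapped linear models is trained on the labeled samples and the unlabeled sample with the highest prediction variance is annotated next. All names (`committee_variance`, `active_learning_loop`, `oracle`), the linear model, and the two features are hypothetical and chosen only for illustration.

```python
import numpy as np


def fit_linear(X, y):
    """Least-squares linear fit; the last coefficient is the intercept."""
    A = np.column_stack([X, np.ones(len(X))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def predict_linear(coef, X):
    """Predict with coefficients returned by fit_linear."""
    A = np.column_stack([X, np.ones(len(X))])
    return A @ coef


def committee_variance(X_lab, y_lab, X_pool, n_models=10, rng=None):
    """Variance of predictions across bootstrapped models (assumed criterion)."""
    rng = np.random.default_rng(rng)
    preds = []
    for _ in range(n_models):
        # Resample the labeled set with replacement and refit.
        idx = rng.integers(0, len(X_lab), size=len(X_lab))
        coef = fit_linear(X_lab[idx], y_lab[idx])
        preds.append(predict_linear(coef, X_pool))
    return np.var(np.stack(preds), axis=0)


def active_learning_loop(X_pool, oracle, n_initial=5, budget=10, seed=0):
    """Label `budget` extra samples, always picking the most uncertain one.

    `oracle` stands in for the costly physical measurement (hypothetical).
    Returns the final model coefficients and the indices that were labeled.
    """
    rng = np.random.default_rng(seed)
    start = rng.choice(len(X_pool), size=n_initial, replace=False)
    labeled = [int(i) for i in start]
    y = {i: oracle(X_pool[i]) for i in labeled}

    for _ in range(budget):
        unlabeled = [i for i in range(len(X_pool)) if i not in y]
        if not unlabeled:
            break
        X_lab = X_pool[labeled]
        y_lab = np.array([y[i] for i in labeled])
        var = committee_variance(X_lab, y_lab, X_pool[unlabeled], rng=rng)
        pick = unlabeled[int(np.argmax(var))]
        y[pick] = oracle(X_pool[pick])  # the only step that costs resources
        labeled.append(pick)

    X_lab = X_pool[labeled]
    y_lab = np.array([y[i] for i in labeled])
    return fit_linear(X_lab, y_lab), labeled


if __name__ == "__main__":
    rng = np.random.default_rng(42)
    # Hypothetical pool: two raw-material features per milk batch.
    pool = rng.normal(size=(50, 2))
    # Hypothetical noisy outcome, e.g. yield as a linear function of features.
    noisy_oracle = lambda x: 2.0 * x[0] - 1.0 * x[1] + 0.5 + rng.normal(scale=0.1)
    coef, chosen = active_learning_loop(pool, noisy_oracle, n_initial=5, budget=10)
    print("coefficients:", coef)
    print("labeled indices:", chosen)
```

The design choice that matters in such a setting is that the oracle is called only once per iteration, so the annotation budget directly bounds the number of physical measurements. Other selection criteria could replace the committee variance without changing the loop.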