RO Oct 3, 2021

Annotation Cost Reduction of Stream-based Active Learning by Automated Weak Labeling using a Robot Arm

arXiv:2110.00947v1
3 citations
Originality: Incremental advance
AI Analysis

This work addresses annotation cost reduction for machine learning practitioners in industrial settings like object classification on conveyors, but it is incremental as it builds on existing stream-based active learning and self-training techniques.

The paper tackles the high human annotation cost of stream-based active learning by proposing a method that uses a robot arm for automated weak labeling, achieving performance equal to or better than that of conventional methods while reducing human cost.

Stream-based active learning (AL) is an efficient method for collecting training data, and it is used to reduce the human annotation cost of machine learning. However, the human cost is arguably still not low enough, because most previous studies assume that the oracle is a human with domain knowledge. In this study, we propose a method that replaces part of the oracle's work in stream-based AL with self-training that uses weak labeling by a robot arm. A camera attached to the robot arm captures a series of images of a streamed object, all of which should share the same label. We use this information as a weak label that connects a pseudo-label (an estimated class label) to a target instance. Our method selects two images from each series: a high-confidence image to obtain a correct pseudo-label and a low-confidence image to improve the performance of the classifier. The pseudo-label assigned to the high-confidence image is paired with the target instance (the low-confidence image). This technique mitigates an inefficiency of self-training, namely the difficulty of creating pseudo-labeled training data that has a high impact on the target classifier. In the experiments, we applied the proposed method to the classification of objects on a belt conveyor. We evaluated performance against human cost in multiple scenarios that account for the temporal variation of data. The proposed method achieves performance equal to or better than that of conventional methods while reducing human cost.
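The pairing step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes that confidence is measured as the highest predicted class probability for each image, and that a human is queried only when even the most confident image falls below a threshold. The function name `pair_pseudo_label`, the `threshold` parameter, and the example probabilities are all hypothetical.

```python
import numpy as np


def pair_pseudo_label(probs, threshold=0.9):
    """Pair a pseudo-label from the most confident view with the least confident view.

    probs: array-like of shape (n_views, n_classes) holding the classifier's
    predicted class probabilities for every image of ONE streamed object.
    threshold: hypothetical cut-off below which a human oracle is asked instead.
    """
    probs = np.asarray(probs, dtype=float)
    confidence = probs.max(axis=1)         # assumed confidence: top-class probability
    source = int(confidence.argmax())      # high-confidence view supplies the pseudo-label
    target = int(confidence.argmin())      # low-confidence view becomes the training instance
    return {
        "source_view": source,
        "target_view": target,
        "pseudo_label": int(probs[source].argmax()),
        "needs_human": bool(confidence[source] < threshold),
    }


# Three views of the same object taken from different camera poses (made-up numbers).
views = [
    [0.40, 0.35, 0.25],  # ambiguous view
    [0.05, 0.92, 0.03],  # clear view
    [0.30, 0.50, 0.20],
]
result = pair_pseudo_label(views)
print(result)
```

In this example the clear view (index 1) supplies class 1 as the pseudo-label, and that label is attached to the ambiguous view (index 0), which is the image the classifier is expected to learn the most from.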

