Life is not black and white -- Combining Semi-Supervised Learning with fuzzy labels
This work addresses annotation variability for practitioners using semi-supervised learning, but it is incremental: it builds on existing methods without a major breakthrough.
The paper tackles the problem of inconsistent annotations (fuzzy labels) in semi-supervised learning, which can lead to inferior performance or higher costs, and proposes incorporating fuzzy labels to potentially reduce costs and improve consistency in the development cycle.
The amount of labeled data required is one of the biggest issues in deep learning. Semi-Supervised Learning can potentially solve this issue by using additional unlabeled data. However, many datasets suffer from variability in their annotations. Labels aggregated from these annotations are not consistent between different annotators and are thus considered fuzzy. Semi-Supervised Learning often does not take these fuzzy labels into account. This leads either to inferior performance or to higher initial annotation costs in the complete machine learning development cycle. We envision the incorporation of fuzzy labels into Semi-Supervised Learning and give a proof-of-concept of the potential lower costs and higher consistency in the complete development cycle. As part of our concept, we discuss current limitations, future research opportunities, and potential broad impacts.
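To make the notion of a fuzzy label concrete, the following is a minimal sketch, not the method of the paper. It assumes that a fuzzy label can be represented as the normalized vote distribution of several annotators for one item, and that a model could be trained against it with a soft-target cross-entropy. The function names `fuzzy_label` and `soft_cross_entropy` and the example classes are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def fuzzy_label(annotations, classes):
    """Convert the raw votes for one item into a distribution over classes.

    Assumption: every annotator vote counts equally.
    """
    counts = Counter(annotations)
    total = len(annotations)
    return [counts[c] / total for c in classes]


def soft_cross_entropy(target, predicted, eps=1e-12):
    """Cross-entropy between a soft target and a predicted distribution."""
    return -sum(t * math.log(p + eps) for t, p in zip(target, predicted))


# Hypothetical example: four annotators disagree on a single image.
classes = ["cat", "dog"]
votes = ["cat", "cat", "dog", "cat"]

soft = fuzzy_label(votes, classes)   # keeps the disagreement
hard = [1.0, 0.0]                    # majority vote discards it

prediction = [0.7, 0.3]              # hypothetical model output
loss_soft = soft_cross_entropy(soft, prediction)
loss_hard = soft_cross_entropy(hard, prediction)
```

Under these assumptions the soft target preserves the annotator disagreement that a majority vote would discard. How such targets are best combined with unlabeled data is the open question the abstract raises.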