Multi-task learning on partially labeled datasets via invariant/equivariant semi-supervised learning
This work addresses the problem of training multi-task models (e.g., object detection and segmentation) when labeled data is scarce, offering a practical semi-supervised approach for computer vision tasks.
The paper investigates invariant and equivariant semi-supervised learning (FixMatch and Dense FixMatch, respectively) for multi-task models trained on partially labeled datasets. Both methods outperform supervised baselines in most settings, with the largest gains when a task has few labeled samples, and Dense FixMatch generally performs better than FixMatch.
We investigate the potential of invariant and equivariant semi-supervised learning for addressing the challenges of training multi-task models on partially labeled datasets with differently structured output tasks. Specifically, we use the popular FixMatch method for invariant semi-supervised learning and its equivariant extension, Dense FixMatch. We evaluate their performance on the Cityscapes and BDD100K datasets for object detection and semantic segmentation, two prevalent tasks in computer vision. We consider varying sizes of the subsets annotated for each task and different overlaps among them. Both invariant and equivariant semi-supervised learning outperform supervised baselines in most situations. The improvements are most significant when fewer labeled samples are available for a task, and the equivariant approach generally gives better results. Our study suggests that invariant/equivariant learning is a promising general direction for multi-task learning from limited labeled data.
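The difference between the invariant and equivariant consistency objectives can be illustrated with a minimal pure-Python sketch. It is a simplification for intuition only, not the authors' implementation: the function names, the 0.95 confidence threshold, and the use of a horizontal flip as the geometric augmentation are all assumptions made for this example. In the invariant case, a single pseudo-label from the weakly augmented view supervises the strongly augmented view unchanged. In the equivariant case, the per-pixel pseudo-label map is first moved by the same geometric transform that was applied to the strong view.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def pseudo_label(weak_logits, threshold=0.95):
    """Return (class index, keep flag) from weak-view logits.

    The threshold value is an assumption for this sketch.
    """
    probs = softmax(weak_logits)
    conf = max(probs)
    return probs.index(conf), conf >= threshold


def cross_entropy(logits, target):
    """Cross-entropy of one prediction against a hard target class."""
    return -math.log(softmax(logits)[target])


def fixmatch_loss(weak_logits, strong_logits, threshold=0.95):
    """Invariant consistency: the label does not change under augmentation."""
    label, keep = pseudo_label(weak_logits, threshold)
    return cross_entropy(strong_logits, label) if keep else 0.0


def hflip(grid):
    """Hypothetical geometric augmentation: mirror each row of an H x W grid."""
    return [list(reversed(row)) for row in grid]


def dense_fixmatch_loss(weak_map, strong_map, transform=hflip, threshold=0.95):
    """Equivariant consistency on an H x W map of per-pixel logits.

    The pseudo-label map from the weak view is moved by the same geometric
    transform that produced the strong view, then compared pixel by pixel.
    Low-confidence pixels contribute zero; the loss is averaged over all pixels.
    """
    labels = [[pseudo_label(px, threshold) for px in row] for row in weak_map]
    labels = transform(labels)
    total, count = 0.0, 0
    for label_row, logit_row in zip(labels, strong_map):
        for (label, keep), logits in zip(label_row, logit_row):
            count += 1
            if keep:
                total += cross_entropy(logits, label)
    return total / count if count else 0.0


# Toy 1 x 2 "image" with two classes; the strong view is horizontally flipped.
weak_map = [[[10.0, 0.0], [0.0, 10.0]]]
flipped_strong = [[[0.0, 10.0], [10.0, 0.0]]]
unflipped_strong = [[[10.0, 0.0], [0.0, 10.0]]]

aligned = dense_fixmatch_loss(weak_map, flipped_strong)
misaligned = dense_fixmatch_loss(weak_map, unflipped_strong)
```

In this toy case `aligned` is close to zero and `misaligned` is large, because predictions on the flipped view agree with the pseudo-labels only after the label map is flipped as well. In a partially labeled multi-task setting, a term of this kind would be added per task for the samples that lack that task's annotation, alongside the usual supervised loss on the annotated samples.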