Is Two-shot All You Need? A Label-efficient Approach for Video Segmentation in Breast Ultrasound
This work addresses label inefficiency in medical video segmentation. The contribution is incremental: it builds on existing VOS methods and adds a novel training approach.
The paper tackles the problem of dense annotation requirements and accumulative errors in breast ultrasound video segmentation by proposing a two-shot training paradigm, achieving comparable performance to fully annotated methods with only 1.9% of training labels.
Breast lesion segmentation from breast ultrasound (BUS) videos could assist in early diagnosis and treatment. Existing video object segmentation (VOS) methods usually require dense annotation, which is often inaccessible for medical datasets. Furthermore, they suffer from accumulative errors and a lack of explicit space-time awareness. In this work, we propose a novel two-shot training paradigm for BUS video segmentation. It not only captures free-range space-time consistency but also utilizes a source-dependent augmentation scheme. This label-efficient learning framework is validated on a challenging in-house BUS video dataset. Results show that it achieves performance comparable to fully annotated counterparts with only 1.9% of the training labels.
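As a rough illustration of the label budget, the sketch below computes the fraction of annotated frames when each training video carries a fixed number of labels. It assumes that "two-shot" means two annotated frames per video, which the abstract does not state explicitly. The function name `label_fraction` and the video lengths are hypothetical and are not taken from the in-house dataset.

```python
def label_fraction(frames_per_video, labeled_per_video=2):
    """Return the share of frames that are annotated.

    Assumption: every video contributes at most `labeled_per_video`
    annotated frames (two, under a two-shot reading of the abstract).
    """
    total = sum(frames_per_video)
    if total == 0:
        raise ValueError("frames_per_video must contain at least one frame")
    # A video shorter than the label budget cannot have more labels than frames.
    labeled = sum(min(labeled_per_video, n) for n in frames_per_video)
    return labeled / total


# Hypothetical video lengths, chosen only for illustration.
lengths = [90, 105, 120]
print(f"{label_fraction(lengths):.1%}")  # prints 1.9%
```

Under this assumption, a label fraction near 1.9% corresponds to videos averaging roughly 105 frames each, since 2 / 105 ≈ 0.019.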