Coarse-to-Fine Covid-19 Segmentation via Vision-Language Alignment
This work addresses the challenge of limited data and annotations for COVID-19 lesion segmentation, with the aim of aiding physicians in diagnosis and treatment. The contribution appears incremental, however, since it builds on existing segmentation and vision-language methods.
The paper tackles COVID-19 lesion segmentation by proposing C2FVL, a coarse-to-fine framework that aligns vision and language to incorporate text information. The authors report that it outperforms state-of-the-art methods on chest X-ray and CT datasets.
Segmentation of COVID-19 lesions can help physicians diagnose and treat COVID-19 more effectively. However, few relevant studies exist because COVID-19 datasets lack detailed information and high-quality annotations. To address this problem, we propose C2FVL, a Coarse-to-Fine segmentation framework via Vision-Language alignment that merges text information, which describes the number of lesions and their specific locations, with image information. Introducing text information allows the network to achieve better predictions on challenging datasets. We conduct extensive experiments on two COVID-19 datasets, covering chest X-ray and CT, and the results demonstrate that our proposed method outperforms other state-of-the-art segmentation methods.
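The abstract does not specify how text and image information are merged. As a rough illustration of one common way to do it, the sketch below lets each image patch feature attend over text token features and adds the attended text context back to the patch (a residual cross-attention with no learned projections). The function name `fuse_text_into_image`, the toy feature values, and the choice of dot-product attention are all assumptions made for illustration; this is not claimed to be the C2FVL architecture.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def fuse_text_into_image(image_feats, text_feats):
    """Hypothetical fusion step: each image patch attends over text tokens.

    image_feats: list of N patch vectors, each of length d
    text_feats:  list of M token vectors, each of length d
    Returns (fused, attention), where attention[i][j] is the weight
    that patch i places on text token j.
    """
    d = len(image_feats[0])
    scale = math.sqrt(d)
    fused, attention = [], []
    for patch in image_feats:
        # Similarity of this patch to every text token, scaled by sqrt(d).
        weights = softmax([dot(patch, tok) / scale for tok in text_feats])
        # Attention-weighted average of the text token vectors.
        context = [
            sum(w * tok[k] for w, tok in zip(weights, text_feats))
            for k in range(d)
        ]
        # Residual connection: keep the visual feature, add text context.
        fused.append([p + c for p, c in zip(patch, context)])
        attention.append(weights)
    return fused, attention


# Toy inputs, made up for illustration only (not from any dataset).
image_feats = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
text_feats = [[2.0, 0.0], [0.0, 2.0]]
fused, attention = fuse_text_into_image(image_feats, text_feats)
```

In this toy setup, a patch whose feature points in the same direction as a text token places more attention weight on that token, which is the intuition behind using a description of lesion count and location to guide the visual features. A real model would typically add learned query, key, and value projections and operate on encoder outputs rather than hand-written vectors.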